Abstract

Adaptive dynamics has so far been put on a rigorous footing only for clonal inheritance.
We extend this to sexually reproducing diploids, although admittedly still under the
restriction of an unstructured population with Lotka-Volterra-like dynamics and single
locus genetics (as in Kimura's 1965 infinite allele model). Starting from a stochastic
birth and death process model, we prove under the usual smoothness assumptions that, when
advantageous mutations are rare and mutational steps are not too large, the population
behaves on the mutational time scale (the 'long' time scale of the literature on the
genetical foundations of ESS theory) as a jump process moving between homozygous states
(the trait substitution sequence of the adaptive dynamics literature). Essential technical
ingredients are a rigorous estimate for the probability of invasion in a dynamic diploid
population, a rigorous invasion-implies-substitution theorem based on geometric singular
perturbation theory, and the use of the Skorohod M1 topology to arrive at a functional
convergence result. In the limit of small mutational steps this process in turn gives rise
to a differential equation in allele or in phenotype space of a type referred to in the
adaptive dynamics literature as the 'canonical equation'.

1 Introduction



Adaptive dynamics (AD) aims at providing an ecology-based framework for scaling up from
the micro-evolutionary process of gene substitutions to meso-evolutionary time scales and
phenomena (also called long term evolution in papers on the foundations of ESS theory,
that is, meso-evolutionary statics, cf. Eshel (1983, in press); Eshel, Feldman and Bergman
(1998); Eshel and Feldman (2001)). One of the more interesting phenomena that AD has
brought to light is the possibility of an emergence of phenotypic diversification at so-called
branching points, without the need for a geographical substrate Metz et al. (1996); Geritz
et al. (1998); Doebeli and Dieckmann (2000). This ecological tendency may in the sexual
case induce sympatric speciation Dieckmann and Doebeli (1999). However, a population
subject to mutation limitation and initially without variation stays essentially uni-modal,
closely centered around a type that evolves continuously, as long as it does not get in
the neighborhood of a branching point. In this paper we focus on the latter aspect of
evolutionary trajectories.



AD was first developed, in the wake of Hofbauer and Sigmund (1987); Marrow, Law and
Cannings (1992); Metz, Nisbet and Geritz (1992), as a systematic framework at a physicist
level of rigor by Dieckmann and Law (1996) and by Metz and Geritz and various coworkers
Metz, Nisbet and Geritz (1992); Metz et al. (1996); Geritz et al. (1998). The first two
authors started from a Lotka-Volterra style birth and death process while the intent of the
latter authors was more general, so far culminating in Durinx, Metz and Meszena (2008).
The details for general physiologically structured populations were worked out at a
physicist level of rigor in Durinx, Metz and Meszena (2008), while the theory was put on a
rigorous mathematical footing by Champagnat and Meleard and coworkers Champagnat, Ferriere
and Meleard (2008); Champagnat (2006); Meleard and Tran (2009), and recently also from a
different perspective by Peter Jagers and coworkers Klebaner et al. (2011). All these
papers deal only with clonal models. In the meantime a number of papers have appeared that
deal on a heuristic basis with special models with Mendelian genetics (e.g. Kisdi and
Geritz (1999); Van Dooren (1999, 2000); Van Doorn and Dieckmann (2006); Proulx and Phillips
(2006); Peischl and Burger (2008)), while the general biological underpinning for the ADs
of Mendelian populations is described in Metz (in press). In the present paper we outline a
mathematically rigorous approach along the path set out in Champagnat, Ferriere and Meleard
(2008); Champagnat (2006), with proofs for those results that differ in some essential
manner between the clonal and Mendelian cases.



As in the single locus models of Kisdi and Geritz (1999) and Peischl and Burger (2008),
and contrary to the treatment in Metz (in press), we still deal only with the single locus
infinite allele case (cf. Kimura (1965)), while deferring the infinite loci case to a
future occasion.

Our reference framework is a diploid population in which each individual's ability to survive
and reproduce depends only on a quantitative phenotypic trait determined by its genotype,
represented by the types of two alleles on a single locus. Evolution of the trait distribution
in the population results from three basic mechanisms: heredity, which transmits traits
to new offspring and thus ensures the extended existence of a trait distribution; mutation,
which generates novel variation in the trait values in the population; and selection, which
acts on these trait values as a result of trait dependent differences in fertility and
mortality. Selection is made frequency dependent by the competition of individuals for
limited resources, in line with the general ecological spirit of AD. Our goal is to capture
in a simple manner the interplay between these different mechanisms.



2 The Model 



We consider a Mendelian population and a hereditary trait that is determined by the two
alleles on but a single locus with many possible alleles (the infinite alleles model of
Kimura (1965)). These alleles are characterized by an allelic trait $u$. Each individual
$i$ is thus characterized by its two allelic trait values $(u_1^i, u_2^i)$, hereafter referred
to as its genotype, with corresponding phenotype $\phi(u_1^i, u_2^i)$, with
$\phi : \mathbb{R}^m \to \mathbb{R}^n$. In order to keep the technicalities to a minimum we
shall below proceed on the assumption that $n = m = 1$. In the Discussion we give a
heuristic description of how the extension to general $n$ and $m$ can be made. When we are
dealing with a fully homozygous population we shall refer to its unique allele as $A$, and
when we consider but two co-circulating alleles we refer to these as $A$ and $a$.

We make the standard assumptions that $\phi$ and all other coefficient functions are smooth
and that there are no parental effects, so that $\phi(u_1, u_2) = \phi(u_2, u_1)$. An
immediate consequence is that if $u_a = u_A + \zeta$, $|\zeta| \ll 1$, then

$$\phi(u_A, u_a) = \phi(u_A, u_A) + \partial_2\phi(u_A, u_A)\,\zeta + O(\zeta^2)
\quad\text{and}\quad
\phi(u_a, u_a) = \phi(u_A, u_A) + 2\,\partial_2\phi(u_A, u_A)\,\zeta + O(\zeta^2),$$

i.e., the genotype to phenotype map is locally additive,
$\phi(u_A, u_a) \approx \big(\phi(u_A, u_A) + \phi(u_a, u_a)\big)/2$, and the same holds good
for all quantities that smoothly depend on the phenotype.
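
The local additivity can be checked numerically. The following is a minimal sketch, assuming a hypothetical smooth symmetric map `phi` chosen purely for illustration (it is not taken from the model); it measures how far the heterozygote phenotype lies from the midpoint of the two homozygote phenotypes, and the gap should shrink like $\zeta^2$.

```python
def phi(u1, u2):
    # Hypothetical smooth, symmetric genotype-to-phenotype map (illustration only).
    return 0.5 * (u1 + u2) + 0.3 * u1 * u2 + 0.1 * (u1 ** 2 + u2 ** 2)


def additivity_gap(u_A, zeta):
    """|phi(u_A, u_a) - (phi(u_A, u_A) + phi(u_a, u_a)) / 2| for u_a = u_A + zeta."""
    u_a = u_A + zeta
    heterozygote = phi(u_A, u_a)
    midpoint = 0.5 * (phi(u_A, u_A) + phi(u_a, u_a))
    return abs(heterozygote - midpoint)


# Gaps for decreasing mutational steps; each should be of order zeta**2.
zetas = (1e-1, 1e-2, 1e-3)
gaps = [additivity_gap(0.5, zeta) for zeta in zetas]
```

For this particular `phi` the gap works out to $0.15\,\zeta^2$, so dividing the step by ten divides the gap by a hundred.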



Remark 2.1 The biological justification for the above assumptions is that the evolutionary
changes that we consider are not so much changes in the coding regions of the gene
under consideration as in its regulation. Protein coding regions are in general preceded
by a large number of relatively short regions where all sorts of regulatory material can
dock. Changes in these docking regions lead to changes in the production rate of the gene
product. Genes are more or less active in different parts of the body, at different times
during development and under different micro-environmental conditions. The allelic type
$u$ should be seen as a vector of such expression levels. The genotype to phenotype map
$\phi$ maps these expression levels to the phenotypic traits under consideration. It is also
from this perspective that we should judge the assumption of smallness of mutational steps:
the influence of any specific regulatory site among its many colleagues tends to be
relatively minor.



The individual-based microscopic model from which we start is a stochastic birth and death
process, with density-dependence through additional deaths from ecological competition,
and Mendelian reproduction with mutation. We assume that the population's size scales
with a parameter $K$ tending to infinity while the effect of the interactions between
individuals scales with $1/K$. This allows taking limits in which we count individuals
weighted with $1/K$. As an interpretation, think of individuals that live in an area of size
$K$ such that the individual effects get diluted with area. For example, when individuals
compete for living space, with each individual taking away only a small fraction of the
total space, the probability of finding a usable bit of space is proportional to the
relative frequency with which such bits are around.



2.1 Model setup 



The allelic trait space $\mathcal{U}$ is assumed to be a closed and bounded interval of
$\mathbb{R}$. Hence the phenotypic trait space is compact. For any
$(u_1, u_2) \in \mathcal{U}^2$, we introduce the following demographic parameters, which are
all assumed to be smooth functions of the allelic traits and thus bounded. Moreover, these
parameters are assumed to depend in principle on the allelic traits through the intermediacy
of the phenotypic trait. Since the latter dependency is symmetric, we assume that all
coefficient functions defined below are symmetric in the allelic traits.



$f(u_1, u_2) \in \mathbb{R}_+$: the per capita birth rate (fertility) of an individual with genotype $(u_1, u_2)$.

$D(u_1, u_2) \in \mathbb{R}_+$: the background death rate of an individual with genotype $(u_1, u_2)$.

$K \in \mathbb{N}$: a parameter scaling the per capita impact on resource density and through that
the population size.

$C((u_1, u_2), (v_1, v_2))/K \in \mathbb{R}_+$: the competitive effect felt by an individual with genotype $(u_1, u_2)$
from an individual with genotype $(v_1, v_2)$. The function $C$ is customarily referred to
as the competition kernel.

$\mu_K \in (0, 1]$: the mutation probability per birth event (assumed to be independent of the
genotype). The idea is that $\mu_K$ is made appropriately small when we let $K$ increase.

$\sigma > 0$: a parameter scaling the mutation amplitude.

$m_\sigma(u, h)\,dh = \frac{1}{\sigma}\, m(u, \frac{h}{\sigma})\,dh$: the mutation law of a mutant allelic trait $u + h$ from an
individual with allelic trait $u$, with $m(u, h)\,dh$ a probability measure with support
$[-1, 1] \cap \{h \mid u + h \in \mathcal{U}\}$. As a result the support of $m_\sigma$ is of size $\le 2\sigma$.
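
The scaling of the mutation law can be illustrated with a short sampling sketch. It assumes, purely for illustration, that the base law $m(u, \cdot)$ is uniform on $[-1, 1] \cap \{h \mid u + h \in \mathcal{U}\}$ and that $\mathcal{U} = [0, 4]$; neither choice comes from the model. A draw $h_0$ from $m(u, \cdot)$ multiplied by $\sigma$ then has density $\frac{1}{\sigma} m(u, h/\sigma)$.

```python
import random

# Hypothetical allelic trait interval U = [U_MIN, U_MAX] (illustration only).
U_MIN, U_MAX = 0.0, 4.0


def sample_base_step(u, rng):
    # Assumed base law m(u, .): uniform on [-1, 1] intersected with {h : u + h in U}.
    lo = max(-1.0, U_MIN - u)
    hi = min(1.0, U_MAX - u)
    return rng.uniform(lo, hi)


def sample_mutant_trait(u, sigma, rng):
    # h = sigma * h0 follows the rescaled law m_sigma(u, .), so |h| <= sigma.
    return u + sigma * sample_base_step(u, rng)


rng = random.Random(0)
sigma = 0.05
interior_steps = [sample_mutant_trait(1.0, sigma, rng) - 1.0 for _ in range(10_000)]
boundary_mutants = [sample_mutant_trait(U_MIN, sigma, rng) for _ in range(1_000)]
```

Because $\mathcal{U}$ is an interval, rescaling an admissible step by $\sigma \le 1$ keeps the mutant trait inside $\mathcal{U}$, which is why no rejection step is needed in this sketch.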



Notational convention: When only two alleles $A$ and $a$ co-circulate, we will use the
shorthand:

$f_{AA} = f(u_A, u_A)$; $f_{Aa} = f(u_A, u_a)$; $f_{aa} = f(u_a, u_a)$; $D_{AA} = D(u_A, u_A)$;
$C((u_A, u_a), (u_A, u_A)) = C_{Aa,AA}$; etc.



To keep things simple we take our model organisms to be hermaphrodites which in their
female role give birth at rate $f$ and in their male role have probabilities proportional
to $f$ to act as the father for such a birth.

We consider, at any time $t \ge 0$, a finite number $N_t$ of individuals, each of them with
genotype in $\mathcal{U}^2$. Let us denote by $(u_1^1, u_2^1), \ldots, (u_1^{N_t}, u_2^{N_t})$ the genotypes of these individuals.






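The state of the population at time $t \ge 0$, rescaled by $K$, is described by the finite
point measure on $\mathcal{U}^2$

$$\nu_t^{\sigma,K} = \frac{1}{K} \sum_{i=1}^{N_t} \delta_{(u_1^i, u_2^i)},$$

where $\delta_{(u_1, u_2)}$ is the Dirac measure at $(u_1, u_2)$.

Let $\langle \nu, g \rangle$ denote the integral of the measurable function $g$ with respect to the measure $\nu$
and $\mathrm{Supp}(\nu)$ the support of the latter. Then $\langle \nu_t^{\sigma,K}, 1 \rangle = N_t/K$ and for any $(u_1, u_2) \in \mathcal{U}^2$, the
positive number $\langle \nu_t^{\sigma,K}, \mathbb{1}_{\{(u_1, u_2)\}} \rangle$ is called the density at time $t$ of genotype $(u_1, u_2)$.

Let $\mathcal{M}_F$ denote the set of finite nonnegative measures on $\mathcal{U}^2$, equipped with the weak
topology, and define

$$\mathcal{M}^K = \Big\{ \frac{1}{K} \sum_{i=1}^{n} \delta_{(u_1^i, u_2^i)} \;:\; n \ge 0,\ (u_1^1, u_2^1), \ldots, (u_1^n, u_2^n) \in \mathcal{U}^2 \Big\}.$$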



An individual with genotype $(u_1, u_2)$ in the population $\nu_t^{\sigma,K}$ reproduces with an individual
with genotype $(u_1^j, u_2^j)$ at a rate $f(u_1, u_2)\, \dfrac{f(u_1^j, u_2^j)}{K \langle \nu_t^{\sigma,K}, f \rangle}$.

With probability $1 - \mu_K(u_1, u_2)$ reproduction follows the Mendelian rules, with a newborn
getting a genotype with coordinates that are sampled at random from each parent.

At reproduction mutations occur with probability $\mu_K(u_1, u_2)$ and then change one of the
two allelic traits of the newborn from $u$ to $u + h$ with $h$ drawn from $m_\sigma(u, h)\,dh$.

Each individual dies at rate

$$D(u_1, u_2) + C * \nu_t^{\sigma,K}(u_1, u_2) = D(u_1, u_2) + \frac{1}{K} \sum_{j=1}^{N_t} C\big((u_1, u_2), (u_1^j, u_2^j)\big).$$

The competitive effect of individual $j$ on an individual $i$ is described by an increase of
$C((u_1^i, u_2^i), (u_1^j, u_2^j))/K$ of the latter's death rate. The parameter $K$ scales the strength of
competition: the larger $K$, the less individuals interact. This decreased interaction goes
hand in hand with a larger population size, in such a way that densities stay well-behaved.
Appendix A summarizes the long tradition of and supposed rationale for the representation
of competitive interactions by competition kernels.

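For measurable functions $F : \mathbb{R} \to \mathbb{R}$ and $g : \mathcal{U}^2 \to \mathbb{R}$, $g$ symmetric, let us define the
function $F_g$ on $\mathcal{M}^K$ by $F_g(\nu) = F(\langle \nu, g \rangle)$.

For a genotype $(u_1, u_2)$ and a point measure $\nu$, we define the Mendelian reproduction
operator

$$A F_g(\nu, u_1^i, u_2^i, u_1^j, u_2^j) = \frac{1}{4} \Big[ F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_1^i, u_1^j)\big) + F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_1^i, u_2^j)\big) + F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_2^i, u_1^j)\big) + F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_2^i, u_2^j)\big) \Big] - F_g(\nu), \tag{2.2}$$

and for $m(u, h)\,dh$ a measure on $\mathbb{R}$ parametrized by $u$, we define the Mendelian
reproduction-cum-mutation operator

\begin{align}
M F_g(\nu, u_1^i, u_2^i, u_1^j, u_2^j) \tag{2.3} \\
= \frac{1}{8} \int \Big\{ & \Big( F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_1^i + h, u_1^j)\big) + F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_1^i + h, u_2^j)\big) \Big)\, m_\sigma(u_1^i, h) \notag \\
& + \Big( F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_2^i + h, u_1^j)\big) + F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_2^i + h, u_2^j)\big) \Big)\, m_\sigma(u_2^i, h) \notag \\
& + \Big( F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_1^i, u_1^j + h)\big) + F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_2^i, u_1^j + h)\big) \Big)\, m_\sigma(u_1^j, h) \notag \\
& + \Big( F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_1^i, u_2^j + h)\big) + F\big(\langle \nu, g \rangle + \tfrac{1}{K} g(u_2^i, u_2^j + h)\big) \Big)\, m_\sigma(u_2^j, h) \Big\}\, dh - F_g(\nu). \tag{2.4}
\end{align}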

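The process $(\nu_t^{\sigma,K}, t \ge 0)$ is a $\mathcal{M}^K$-valued Markov process with infinitesimal generator
defined for any bounded measurable functions $F_g$ from $\mathcal{M}^K$ to $\mathbb{R}$
and $\nu = \frac{1}{K} \sum_{i=1}^{n} \delta_{(u_1^i, u_2^i)}$ by

\begin{align}
L^K F_g(\nu) = {} & \sum_{i=1}^{n} \big( D(u_1^i, u_2^i) + C * \nu(u_1^i, u_2^i) \big) \Big( F\big(\langle \nu, g \rangle - \tfrac{1}{K} g(u_1^i, u_2^i)\big) - F_g(\nu) \Big) \notag \\
& + \sum_{i=1}^{n} \big(1 - \mu_K(u_1^i, u_2^i)\big) \sum_{j=1, j \ne i}^{n} f(u_1^i, u_2^i)\, \frac{f(u_1^j, u_2^j)}{K \langle \nu, f \rangle}\, A F_g(\nu, u_1^i, u_2^i, u_1^j, u_2^j) \notag \\
& + \sum_{i=1}^{n} \mu_K(u_1^i, u_2^i) \sum_{j=1, j \ne i}^{n} f(u_1^i, u_2^i)\, \frac{f(u_1^j, u_2^j)}{K \langle \nu, f \rangle}\, M F_g(\nu, u_1^i, u_2^i, u_1^j, u_2^j). \tag{2.5}
\end{align}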



The first term describes the deaths, the second term describes the births without mutation
and the third term describes the births with mutations. (We neglect the occurrence of
multiple mutations in one zygote, as those unpleasant-looking terms will become negligible
anyway when $\mu_K$ goes to zero.) The density-dependent non-linearity of the death term
models the competition between individuals and makes selection frequency dependent.

Let us denote by (A) the following three assumptions

(A1) The functions $f$, $D$, $\mu_K$ and $C$ are smooth functions and thus bounded since $\mathcal{U}$ is
compact. Therefore there exist $\bar f, \bar D, \bar C < +\infty$ such that

$0 \le f(\cdot) \le \bar f$, $0 \le D(\cdot) \le \bar D$, $0 \le C(\cdot, \cdot) \le \bar C$.

(A2) $r(u_1, u_2) = f(u_1, u_2) - D(u_1, u_2) > 0$ for any $(u_1, u_2) \in \mathcal{U}^2$, and there exists $\underline{C} > 0$
such that $\underline{C} \le C(\cdot, \cdot)$.

(A3) For any $\sigma > 0$, there exists a function $\bar m_\sigma : \mathbb{R} \to \mathbb{R}_+$, $\int \bar m_\sigma(h)\,dh < \infty$, such that
$m_\sigma(u, h) \le \bar m_\sigma(h)$ for any $u \in \mathcal{U}$ and $h \in \mathbb{R}$.

For fixed $K$, under (A1) and (A3) and assuming that $\mathbb{E}(\langle \nu_0^{\sigma,K}, 1 \rangle) < \infty$, the proof of existence
and uniqueness in law of a process on $\mathbb{D}(\mathbb{R}_+, \mathcal{M}^K)$ with infinitesimal generator $L^K$ can be
adapted from the one in Fournier and Meleard (2004) or Champagnat, Ferriere and Meleard
(2008). The process can be constructed as the solution of a stochastic differential equation
driven by Poisson point measures describing each jump event. Assumption (A2) prevents the
population from exploding or going extinct too fast.

3 The short term large population and rare mutations limit: 
how selection changes allele frequencies 

In this section we study the large population and rare mutations approximation of the
process described above, when $K$ tends to infinity and $\mu_K$ tends to zero. The limit becomes
deterministic and continuous and the mutation events disappear.



The proof of the following theorem can be adapted from Fournier and Meleard (2004). 



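Theorem 3.1 When $K$ tends to infinity and if $\nu_0^{\sigma,K}$ converges in law to a deterministic
measure $\nu_0$, then the process $(\nu^{\sigma,K})$ converges in law to the deterministic continuous
measure-valued function $(\nu_t, t \ge 0)$ solving, for any bounded measurable symmetric $g$,

$$\langle \nu_t, g \rangle = \langle \nu_0, g \rangle + \int_0^t \Big\{ - \int_{\mathcal{U}^2} \big( D(u_1, u_2) + C * \nu_s(u_1, u_2) \big)\, g(u_1, u_2)\, \nu_s(du_1, du_2) + \int_{\mathcal{U}^2 \times \mathcal{U}^2} \frac{f(u_1, u_2)\, f(v_1, v_2)}{4 \langle \nu_s, f \rangle} \big( g(u_1, v_1) + g(u_1, v_2) + g(u_2, v_1) + g(u_2, v_2) \big)\, \nu_s(du_1, du_2)\, \nu_s(dv_1, dv_2) \Big\}\, ds.$$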



Below we have a closer look at the specific cases of genetically mono- and dimorphic initial 
conditions. 

3.1 Monomorphic populations 

Let us first study the dynamics of a fully homozygote population with genotype $(u_A, u_A)$
corresponding to a unique allele $A$ and genotype $AA$. Assume that the initial condition
is $\frac{N_0^K}{K} \delta_{(u_A, u_A)}$, with $\frac{N_0^K}{K}$ converging to a deterministic number $n_0 > 0$ when $K$ goes to
infinity.

In that case the population process is $\frac{N_t^K}{K} \delta_{(u_A, u_A)}$, where $N_t^K$ is a logistic birth and death
process with birth rate $f_{AA} = f(u_A, u_A)$ and death rate $D_{AA} + \frac{C_{AA,AA}}{K} N_t^K$. The process
$\big(\frac{N_t^K}{K}, t \ge 0\big)$ converges in law when $K$ tends to infinity to the solution $(n(t), t \ge 0)$ of the
logistic equation

$$\frac{dn}{dt}(t) = n(t)\, \big( f_{AA} - D_{AA} - C_{AA,AA}\, n(t) \big), \tag{3.1}$$

with initial condition $n(0) = n_0$. This equation has a unique stable equilibrium equal to
the carrying capacity:

$$\bar n_{AA} = \frac{f_{AA} - D_{AA}}{C_{AA,AA}}. \tag{3.2}$$
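
The convergence of solutions of the logistic equation (3.1) to the carrying capacity (3.2) can be sketched with a simple explicit Euler integration. The parameter values below are arbitrary illustrative assumptions, not values from the paper.

```python
def logistic_rhs(n, f_AA, D_AA, C_AA_AA):
    # Right-hand side of the logistic equation (3.1).
    return n * (f_AA - D_AA - C_AA_AA * n)


def integrate_logistic(n0, f_AA, D_AA, C_AA_AA, dt=0.01, n_steps=5000):
    # Explicit Euler scheme; adequate here because dt is small compared to 1/(f_AA - D_AA).
    n = n0
    for _ in range(n_steps):
        n += dt * logistic_rhs(n, f_AA, D_AA, C_AA_AA)
    return n


# Hypothetical parameter values (illustration only).
f_AA, D_AA, C_AA_AA = 2.0, 0.5, 1.0
carrying_capacity = (f_AA - D_AA) / C_AA_AA           # equation (3.2)
n_from_below = integrate_logistic(0.1, f_AA, D_AA, C_AA_AA)
n_from_above = integrate_logistic(5.0, f_AA, D_AA, C_AA_AA)
```

Starting either below or above the equilibrium, the numerical solution ends up at the same density, as expected for a unique stable equilibrium.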
3.2 Genetic dimorphisms

Let us now assume that there are two alleles $A$ and $a$ in the population (and no
mutation). Then the initial population has the three genotypes $AA$, $Aa$ and $aa$. We use
$(N^K_{AA,t}, N^K_{Aa,t}, N^K_{aa,t})$ to denote the respective numbers of individuals with genotype $AA$,
$Aa$ and $aa$ at time $t$, and $(N_{AA}, N_{Aa}, N_{aa})$ to indicate the typical state of the population.
Let

$$p = \frac{f_{AA} N_{AA} + f_{Aa} N_{Aa}/2}{f_{AA} N_{AA} + f_{Aa} N_{Aa} + f_{aa} N_{aa}}$$

be the relative frequency of $A$ in the gametes. Then the population dynamics
$t \mapsto (N^K_{AA,t}, N^K_{Aa,t}, N^K_{aa,t})$ is a birth and death process with three types and birth rates
$b_{AA}$, $b_{Aa}$, $b_{aa}$ and death rates $d_{AA}$, $d_{Aa}$, $d_{aa}$ defined as follows.

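\begin{align}
b_{AA} &= \big( f_{AA} N_{AA} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)\, p = \frac{\big( f_{AA} N_{AA} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)^2}{f_{AA} N_{AA} + f_{Aa} N_{Aa} + f_{aa} N_{aa}}, \notag \\
b_{Aa} &= \big( f_{AA} N_{AA} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)(1 - p) + \big( f_{aa} N_{aa} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)\, p = 2\, \frac{\big( f_{AA} N_{AA} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)\big( f_{aa} N_{aa} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)}{f_{AA} N_{AA} + f_{Aa} N_{Aa} + f_{aa} N_{aa}}, \tag{3.3} \\
b_{aa} &= \big( f_{aa} N_{aa} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)(1 - p) = \frac{\big( f_{aa} N_{aa} + \tfrac{1}{2} f_{Aa} N_{Aa} \big)^2}{f_{AA} N_{AA} + f_{Aa} N_{Aa} + f_{aa} N_{aa}}, \notag
\end{align}

\begin{align}
d_{AA} &= \Big( D_{AA} + \frac{C_{AA,AA} N_{AA} + C_{AA,Aa} N_{Aa} + C_{AA,aa} N_{aa}}{K} \Big)\, N_{AA}, \notag \\
d_{Aa} &= \Big( D_{Aa} + \frac{C_{Aa,AA} N_{AA} + C_{Aa,Aa} N_{Aa} + C_{Aa,aa} N_{aa}}{K} \Big)\, N_{Aa}, \tag{3.4} \\
d_{aa} &= \Big( D_{aa} + \frac{C_{aa,AA} N_{AA} + C_{aa,Aa} N_{Aa} + C_{aa,aa} N_{aa}}{K} \Big)\, N_{aa}. \notag
\end{align}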

To see this, it suffices to consider the generator (2.5) with $\mu_K = 0$; for instance,
$K \langle \nu, f \rangle = f_{AA} N_{AA} + f_{Aa} N_{Aa} + f_{aa} N_{aa}$.
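
The birth and death rates (3.3) and (3.4) translate directly into code. The sketch below is illustrative only: the dictionary-based layout and all numerical values are assumptions. It also checks the bookkeeping identity that the three birth rates add up to the total fertility $f_{AA} N_{AA} + f_{Aa} N_{Aa} + f_{aa} N_{aa}$, since every birth produces exactly one of the three genotypes.

```python
GENOTYPES = ("AA", "Aa", "aa")


def birth_rates(N, f):
    """Birth rates (3.3) for genotype counts N and fertilities f (dicts keyed by genotype)."""
    F_A = f["AA"] * N["AA"] + 0.5 * f["Aa"] * N["Aa"]   # fertility-weighted A gametes
    F_a = f["aa"] * N["aa"] + 0.5 * f["Aa"] * N["Aa"]   # fertility-weighted a gametes
    F_tot = F_A + F_a
    return {"AA": F_A ** 2 / F_tot,
            "Aa": 2.0 * F_A * F_a / F_tot,
            "aa": F_a ** 2 / F_tot}


def death_rates(N, D, C, K):
    """Death rates (3.4); C[g][h] is the effect felt by genotype g from genotype h."""
    return {g: (D[g] + sum(C[g][h] * N[h] for h in GENOTYPES) / K) * N[g]
            for g in GENOTYPES}


# Hypothetical parameter values (illustration only).
f = {"AA": 2.0, "Aa": 2.1, "aa": 2.2}
D = {"AA": 0.5, "Aa": 0.5, "aa": 0.5}
C = {g: {h: 1.0 for h in GENOTYPES} for g in GENOTYPES}
N = {"AA": 90, "Aa": 8, "aa": 2}
b = birth_rates(N, f)
d = death_rates(N, D, C, K=100)
total_fertility = sum(f[g] * N[g] for g in GENOTYPES)
```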

Proposition 3.2 Assume that the initial condition $K^{-1}(N^K_{AA,0}, N^K_{Aa,0}, N^K_{aa,0})$ converges
to a deterministic vector $(x_0, y_0, z_0)$ when $K$ goes to infinity. Then the normalized
process $K^{-1}(N^K_{AA,t}, N^K_{Aa,t}, N^K_{aa,t})$ converges in law when $K$ tends to infinity to the solution
$(x(t), y(t), z(t)) = \varphi_t(x_0, y_0, z_0)$ of

$$\frac{d}{dt} \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix} = X(x(t), y(t), z(t)), \tag{3.5}$$

where

$$X(x, y, z) = \begin{pmatrix} b_{AA}(x, y, z) - d_{AA}(x, y, z) \\ b_{Aa}(x, y, z) - d_{Aa}(x, y, z) \\ b_{aa}(x, y, z) - d_{aa}(x, y, z) \end{pmatrix}, \tag{3.6}$$

with

$$b_{AA}(x, y, z) = \frac{\big( f_{AA}\, x + \tfrac{1}{2} f_{Aa}\, y \big)^2}{f_{AA}\, x + f_{Aa}\, y + f_{aa}\, z}, \qquad d_{AA}(x, y, z) = \big( D_{AA} + C_{AA,AA}\, x + C_{AA,Aa}\, y + C_{AA,aa}\, z \big)\, x,$$

and similar expressions for the other terms.

Due to its special functional form, the vector field $X$ has some particular properties. We
summarize some of them in the following Propositions.



Proposition 3.3 The vector field (3.6) has two fixed points $(\bar n_{AA}, 0, 0)$ and $(0, 0, \bar n_{aa})$
(denoted below by $AA$ and $aa$) where

$$\bar n_{AA} = \frac{f_{AA} - D_{AA}}{C_{AA,AA}} \quad\text{and}\quad \bar n_{aa} = \frac{f_{aa} - D_{aa}}{C_{aa,aa}}.$$

The $(3 \times 3)$ Jacobian matrix $DX(AA)$ has the eigenvalues $-f_{AA} + D_{AA}$ (negative by
assumption (A2)), $-C_{aa,AA}\, \bar n_{AA} - D_{aa} < 0$, and

$$S_{Aa,AA} = f_{Aa} - D_{Aa} - C_{Aa,AA}\, \bar n_{AA}.$$

An analogous result holds for $DX(aa)$.

This result follows from a direct computation left to the reader.
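
As a sanity check on Proposition 3.3, the Jacobian of the vector field (3.6) at the resident equilibrium can be approximated by central finite differences. The sketch below uses arbitrary illustrative parameter values (assumptions, not values from the paper). At $(\bar n_{AA}, 0, 0)$ the Jacobian turns out to be upper triangular in the ordering $(x, y, z)$, so its diagonal entries are the eigenvalues stated in the proposition.

```python
GENOTYPES = ("AA", "Aa", "aa")

# Hypothetical parameter values (illustration only); C[g][h] is felt by g from h.
f = {"AA": 2.0, "Aa": 2.1, "aa": 2.2}
D = {"AA": 0.5, "Aa": 0.5, "aa": 0.5}
C = {"AA": {"AA": 1.0, "Aa": 0.9, "aa": 0.8},
     "Aa": {"AA": 0.95, "Aa": 1.0, "aa": 0.9},
     "aa": {"AA": 0.85, "Aa": 0.9, "aa": 1.0}}


def vector_field(state):
    """Vector field X of (3.6) evaluated at state = [x, y, z]."""
    n = dict(zip(GENOTYPES, state))
    F_A = f["AA"] * n["AA"] + 0.5 * f["Aa"] * n["Aa"]
    F_a = f["aa"] * n["aa"] + 0.5 * f["Aa"] * n["Aa"]
    F_tot = F_A + F_a
    births = {"AA": F_A ** 2 / F_tot, "Aa": 2.0 * F_A * F_a / F_tot, "aa": F_a ** 2 / F_tot}
    return [births[g] - (D[g] + sum(C[g][h] * n[h] for h in GENOTYPES)) * n[g]
            for g in GENOTYPES]


def jacobian(point, eps=1e-6):
    """Central finite-difference approximation of DX at the given point."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        up, down = list(point), list(point)
        up[j] += eps
        down[j] -= eps
        X_up, X_down = vector_field(up), vector_field(down)
        for i in range(3):
            J[i][j] = (X_up[i] - X_down[i]) / (2.0 * eps)
    return J


n_AA = (f["AA"] - D["AA"]) / C["AA"]["AA"]             # resident equilibrium density
J = jacobian([n_AA, 0.0, 0.0])
S_Aa_AA = f["Aa"] - D["Aa"] - C["Aa"]["AA"] * n_AA     # invasion fitness eigenvalue
```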
As we will see later on, the eigenvalue $S_{Aa,AA}$ will play a key role in the dynamics of trait
substitutions. It describes the initial growth rate of the number of $Aa$ individuals in a
resident population of $AA$ individuals and is called the invasion fitness of an $Aa$ mutant
in an $AA$ resident population. It is a function of the allelic traits $u_A$ and $u_a$.

Notation: When we wish to emphasize the dependence on the two allelic traits $(u_A, u_a)$,
we use the notation

$$S_{Aa,AA} = S(u_a; u_A) = f(u_A, u_a) - D(u_A, u_a) - C\big((u_A, u_a), (u_A, u_A)\big)\, \frac{f(u_A, u_A) - D(u_A, u_A)}{C\big((u_A, u_A), (u_A, u_A)\big)}. \tag{3.7}$$

Note that the function $S$ is not symmetric in $u_A$ and $u_a$ and that moreover

$$S(u_A; u_A) = 0. \tag{3.8}$$
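
The invasion fitness (3.7) is straightforward to evaluate once the coefficient functions are specified. The sketch below assumes hypothetical choices for illustration only: an additive phenotype, a fertility increasing in the phenotype, a constant background death rate, and a Gaussian competition kernel. None of these come from the paper. It illustrates property (3.8) and the asymmetry of $S$.

```python
import math


def phenotype(u1, u2):
    # Hypothetical additive genotype-to-phenotype map.
    return 0.5 * (u1 + u2)


def f(u1, u2):
    # Hypothetical fertility, increasing in the phenotype.
    return 1.0 + phenotype(u1, u2)


def D(u1, u2):
    # Hypothetical constant background death rate.
    return 0.2


def C(g, h):
    # Hypothetical Gaussian competition kernel on phenotypes; g and h are genotypes.
    return math.exp(-0.5 * (phenotype(*g) - phenotype(*h)) ** 2)


def invasion_fitness(u_a, u_A):
    """S(u_a; u_A) of (3.7): growth rate of rare Aa mutants in an AA resident at equilibrium."""
    resident = (u_A, u_A)
    n_bar = (f(u_A, u_A) - D(u_A, u_A)) / C(resident, resident)
    return f(u_A, u_a) - D(u_A, u_a) - C((u_A, u_a), resident) * n_bar


S_neutral = invasion_fitness(1.0, 1.0)   # property (3.8)
S_up = invasion_fitness(1.1, 1.0)        # mutant with higher fertility
S_down = invasion_fitness(0.9, 1.0)      # mutant with lower fertility
```

With these choices a mutant allele that raises fertility has positive invasion fitness and one that lowers it has negative invasion fitness.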

In Appendices B and C the long term behavior of the flow generated by the vector field
(3.6) is analyzed in more detail. The main conclusions are:



Proposition 3.4 First consider the case when the mutant and resident traits are precisely
equal. Then the total population density goes to a unique equilibrium and the relative
frequencies of the genotypes go to the Hardy-Weinberg proportions $(p^2, 2p(1 - p), (1 - p)^2)$,
i.e., there exists a globally attracting one-dimensional manifold filled with neutrally stable
equilibria parametrized by $p$, with as stable manifolds the populations with the same $p$.

For the mutant and resident sufficiently close, this attracting manifold transforms into
an invariant manifold connecting the pure resident and pure mutant equilibria. When
$S_{Aa,AA} > 0$ the pure resident equilibrium attracts only in the line without any mutant
alleles and its local unstable manifold is contained in the aforementioned invariant manifold
(Theorem C.1). When moreover the traits are sufficiently far from an evolutionarily
singular point (defined by $\partial_1 S(u_A; u_A) = 0$) the movement on the invariant manifold is from
the pure resident to the pure mutant equilibrium, and any movement starting close enough
to the invariant manifold will end up in the pure mutant equilibrium (Theorem C.2).

4 The long term large population and rare mutations limit:
trait substitution sequences



In this section we generalize the clonal theory of adaptive dynamics to the diploid case. We
again make the combined large population and rare mutation assumptions, except that we
now change the time scale to stay focused on the effect of the mutations. Recall that the
mutation probability for an individual with genotype $(u_1, u_2)$ is $\mu_K \in (0, 1]$. Thus the time
scale of the mutations in the population is $\frac{1}{K \mu_K}$. We study the long time behavior of the
population process in this time scale and prove that it converges to a pure jump process
stepping from one homozygote type to another. This process will be a generalization of
the simple Trait Substitution Sequences (TSS) that for the haploid case were heuristically
derived in Dieckmann and Law (1996) and Metz et al. (1996), where they were called
'Adaptive Dynamics', and rigorously underpinned in Champagnat (2006) and Champagnat
and Meleard (2011).



Let us define the set of measures with single homozygote support:

$$\mathcal{M}_0 = \big\{ \bar n_{AA}\, \delta_{(u_A, u_A)} \;;\; u_A \in \mathcal{U} \text{ and } \bar n_{AA} \text{ equilibrium of (3.1)} \big\}.$$



We will denote by $J$ the subset of $\mathcal{U}$ where $\partial_1 S(u; u)$ vanishes. We make the following
hypothesis.

Hypothesis 4.1 For any $u \in J$ we have

$$\frac{d}{du} \big( \partial_1 S(u; u) \big) \ne 0.$$

This hypothesis implies that the zeros of $\partial_1 S(u; u)$ are isolated (see Dieudonne (1969)),
and since $\mathcal{U}$ is compact, $J$ is finite.

Definition 4.2 The points $u^* \in \mathcal{U}$ such that $\partial_1 S(u^*; u^*) = 0$ are called evolutionary
singular strategies (ess).

Note that because of (3.8)

$$\partial_2 S(u^*; u^*) = -\partial_1 S(u^*; u^*) = 0.$$

Let us now define the TSS process which will appear in our asymptotics.



Definition 4.3 For any $\sigma > 0$, we define the pure jump process $(Z_t^\sigma, t \ge 0)$ with values in
$\mathcal{U}$, as follows: its initial condition is $u_{A_0}$ and the process jumps from $u_A$ to $u_a = u_A + h$
with rate

$$f(u_A, u_A)\, \bar n_{AA}\, \frac{[S(u_A + h; u_A)]_+}{f(u_A, u_A + h)}\, m_\sigma(u_A, h)\, dh. \tag{4.1}$$
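
The jump rate (4.1) factors into a rate $f(u_A, u_A)\, \bar n_{AA}$ of candidate mutations, a step $h$ drawn from $m_\sigma(u_A, \cdot)$, and an acceptance probability $[S(u_A + h; u_A)]_+ / f(u_A, u_A + h)$. This suggests simulating the process by thinning, as in the minimal sketch below. All coefficient functions are hypothetical choices made for illustration, the mutation law is assumed uniform on $[-\sigma, \sigma]$, and the boundary of $\mathcal{U}$ is ignored.

```python
import math
import random


def f(u1, u2):
    # Hypothetical fertility, increasing in the mean allelic trait.
    return 1.0 + 0.5 * (u1 + u2)


def D(u1, u2):
    # Hypothetical constant background death rate.
    return 0.2


def C(g, h):
    # Hypothetical Gaussian competition kernel on mean allelic traits.
    return math.exp(-0.5 * (0.5 * (g[0] + g[1]) - 0.5 * (h[0] + h[1])) ** 2)


def n_bar(u):
    # Equilibrium density (3.2) of a monomorphic population with allele u.
    return (f(u, u) - D(u, u)) / C((u, u), (u, u))


def invasion_fitness(u_a, u_A):
    # S(u_a; u_A) as in (3.7).
    return f(u_A, u_a) - D(u_A, u_a) - C((u_A, u_a), (u_A, u_A)) * n_bar(u_A)


def simulate_tss(u0, sigma, t_end, rng):
    """Simulate the jump process of Definition 4.3 by thinning candidate mutations."""
    t, u = 0.0, u0
    path = [(t, u)]
    while True:
        t += rng.expovariate(f(u, u) * n_bar(u))    # waiting time to next candidate mutation
        if t > t_end:
            return path
        h = sigma * rng.uniform(-1.0, 1.0)          # assumed uniform mutation law
        p_invade = max(invasion_fitness(u + h, u), 0.0) / f(u, u + h)
        if rng.random() < p_invade:                 # successful invasion and substitution
            u += h
            path.append((t, u))


path = simulate_tss(u0=1.0, sigma=0.05, t_end=200.0, rng=random.Random(1))
```

With these particular coefficient functions only mutations that increase the trait have positive invasion fitness, so the simulated trait never decreases.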
Remark 4.4 Under our assumptions, the jump process $Z^\sigma$ is well defined on $\mathbb{R}_+$. Note
moreover that the jump from $u_A$ to $u_a$ only happens if the invasion fitness $S(u_a; u_A) > 0$.



We can now state our main theorem. 



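Theorem 4.5 Assume (A). Assume that $\nu_0^{\sigma,K} = \frac{N_0^K}{K} \delta_{(u_{A_0}, u_{A_0})}$ with $\frac{N_0^K}{K}$ converging in law
to $\bar n_{A_0 A_0}$, uniformly bounded in $L^1$ and such that $\partial_1 S(u_{A_0}; u_{A_0}) \ne 0$. (That is, the initial
population is monomorphic for a type that is not an ess.) Assume finally that

$$\forall V > 0, \quad \log K \ll \frac{1}{K \mu_K} \ll \exp(V K), \quad \text{as } K \to \infty. \tag{4.2}$$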


For $\eta > 0$ introduce the stopping time



T°> K = inf ( 



l>0 , '*< K g - ,,:n 

\ U t/K m ^ 



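where $d$ is the distance on the allelic trait space.
Extend $\mathcal{M}_0$ with the cemetery point $\partial$.

Then there exists $\sigma_0(\eta) > 0$ such that for all $0 < \sigma < \sigma_0(\eta)$, the process
$\big( \nu^{\sigma,K}_{t/(K \mu_K)}\, \mathbb{1}_{\{T_\eta^{\sigma,K} > t\}} + \partial\, \mathbb{1}_{\{T_\eta^{\sigma,K} \le t\}} ;\ t \ge 0 \big)$
converges (in the sense of finite dimensional distributions on $\mathcal{M}_0$
equipped with the topology of the total variation norm) to the $\mathcal{M}_0$-valued Markov pure
jump process $(\Lambda_t^\sigma; t \ge 0)$ with

$$\Lambda_t^\sigma = \bar n(Z_t^\sigma)\, \delta_{(Z_t^\sigma, Z_t^\sigma)}\, \mathbb{1}_{\{T_\eta^\sigma > t\}} + \partial\, \mathbb{1}_{\{T_\eta^\sigma \le t\}},$$

where

$$T_\eta^\sigma = \inf\{ t > 0,\ d(Z_t^\sigma, J) \le \eta \}.$$

The process $(\Lambda_t^\sigma; t \ge 0)$ is defined as follows: $\Lambda_0^\sigma = \bar n_{A_0 A_0}\, \delta_{(u_{A_0}, u_{A_0})}$ and it jumps
from $\bar n_{AA}\, \delta_{(u_A, u_A)}$ to $\bar n_{aa}\, \delta_{(u_a, u_a)}$
with $u_a = u_A + h$ and infinitesimal rate (4.1).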



Remark 4.6 Close to singular strategies the convergence to the TSS slows down. To
arrive at a convergence proof it is therefore necessary to excise those close neighborhoods.
This is done by means of the stopping times $T_\eta^{\sigma,K}$ and $T_\eta^\sigma$: we only consider the process
for as long as it stays sufficiently far away from any singular strategies. Assumptions (A)
imply that the thus stopped TSS $(Z_t^\sigma)_t$ is well defined on $\mathbb{R}_+$. Since its jump measure is
absolutely continuous with respect to the Lebesgue measure, it follows that $T_\eta^\sigma$ converges
almost surely to $\infty$ when $\eta$ tends to $0$ (for any fixed $\sigma > 0$).



We now roughly describe the successive steps of the mutation, invasion and substitution
dynamics making up the jump events of the limit process, following the biological heuristics
of Dieckmann and Law (1996); Metz et al. (1996); Metz (in press). The details of the proof
are described in Appendix D, based on the technical Appendices B and C.



The time scale separation that underlies the limit in Theorem 4.5 both simplifies the
processes of invasion and of the substitution of a new successful mutant on the population
dynamical time scale and compresses them to a point event on the evolutionary time scale.
The two main simplifications of the processes of mutant invasion and substitution are the
stabilization of the resident population before the occurrence of a mutation, simplifying
the invasion dynamics, and the restriction of the substitution dynamics to a competition
between two alleles. In the jumps on the evolutionary time scale $t/(K \mu_K)$ these steps occur
in opposite order. First comes the attempt at invasion by a mutant, followed, if successful,
by its substitution, that is, the stabilization to a new monomorphic resident
population. After this comes again a waiting time till the next jump.

To capture the stabilization of the resident population, we prove, on the assumption that
the starting population is monomorphic with genotype $AA$, that for arbitrary fixed $\varepsilon > 0$
and for large $K$ the population density $\langle \nu_t^{\sigma,K}, \mathbb{1}_{\{AA\}} \rangle$ with high probability stays in the
$\varepsilon$-neighborhood of $\bar n_{AA}$ until the next allelic mutant $a$ appears. To this aim, we use large
deviation results for the exit problem from a domain (Freidlin and Wentzell (1984)) already
proved in Champagnat (2006) to deduce that with high probability the time needed for the
population density to leave the $\varepsilon$-neighborhood of $\bar n_{AA}$ is bigger than $\exp(V K)$ for some
$V > 0$. Therefore, until this exit time, the rate of mutation from $AA$ in the population
is close to $\mu_K\, p_{AA}\, f_{AA}\, K\, \bar n_{AA}$ and thus, the first mutation appears before this exit time if
one assumes that

$$\frac{1}{K \mu_K} \ll e^{V K}.$$

Hence, on the time scale $t/(K \mu_K)$ the population level mutation rate from $AA$ parents is
close to

$$p_{AA}\, f_{AA}\, \bar n_{AA}.$$



To analyze the fate of these mutants $a$, we divide the population dynamics of the mutant
alleles into the three phases shown in Fig. 4.1, in a similar way as was done in Champagnat
(2006).



In the first phase (between time $0$ and $t_1$ in Fig. 4.1), the number of mutant individuals of
genotype $Aa$ or $aa$ is small, and the resident population with genotype $AA$ stays close to its
equilibrium density $\bar n_{AA}$. Therefore, the dynamics of the mutant individuals with genotypes
$Aa$ and $aa$ is close to a bi-type birth and death process with birth rates $f_{Aa}\, y + 2 f_{aa}\, z$ and $0$
and death rates $(D_{Aa} + C_{Aa,AA}\, \bar n_{AA})\, y$ and $(D_{aa} + C_{aa,AA}\, \bar n_{AA})\, z$ for a state $(y, z)$. If the
fitness $S_{Aa;AA}$ is positive (i.e. the branching process is super-critical), the probability that
the mutant population with genotype $Aa$ or $aa$ reaches $K\varepsilon > 0$ at some time $t_1$ is close
to the probability that the branching process reaches $K\varepsilon > 0$, which is itself close to its
survival probability $\dfrac{[S_{Aa;AA}]_+}{f_{Aa}}$ when $K$ is large.

Assuming the mutant population with genotype $Aa$ or $aa$ reaches $K\varepsilon > 0$, a second phase
starts. When $K \to +\infty$, the population densities
$\big( \langle \nu_t^{\sigma,K}, \mathbb{1}_{\{AA\}} \rangle, \langle \nu_t^{\sigma,K}, \mathbb{1}_{\{Aa\}} \rangle, \langle \nu_t^{\sigma,K}, \mathbb{1}_{\{aa\}} \rangle \big)$
are close to the solution of the dynamical system (3.5) with the same initial condition, on
any time interval $[0, T]$. The study of this dynamical system (see Appendices B and C)
implies that, if the mutation step $u_a - u_A$ is sufficiently small, then any solution to the
dynamical system starting in some neighborhood of $(\bar n_{AA}, 0, 0)$ converges to the new
equilibrium $(0, 0, \bar n_{aa})$ as time goes to infinity. Therefore, with high probability the population
densities reach the $\varepsilon$-neighborhood of $(0, 0, \bar n_{aa})$ at some time $t_2$. Applying the results in
Theorems C.1 and C.2 for the deterministic system to the approximated stochastic process
is justified by observing that the definition of the stopping times $T_\eta^{\sigma,K}$ and $T_\eta^\sigma$ implies that
the allelic trait $u_A$ stays at all times away from the set $J$.

Figure 4.1: Simulation of the three phases of mutant invasion.

Finally, in the last phase, we use the same idea as in the first phase: since $(0, 0, \bar n_{aa})$ is a
strongly locally stable equilibrium, we can approximate the densities of the traits $AA$ and
$Aa$ by a bi-type sub-critical branching process. Therefore, they reach $0$ in finite time and
the process comes back to where we started our argument (a monomorphic population),
until the next mutation.



In Champagnat and Meleard (2011) it is proved that the duration of these three phases is
of order $\log K$. Therefore, under the assumption

$$\log K \ll \frac{1}{K \mu_K},$$

the next mutation occurs after these three phases with high probability. Then the time
scale Assumption (4.2) allows us to conclude, taking the limits $K$ tending to infinity and
then $\varepsilon$ to $0$. Then we repeat the argument using the Markov property.



Note that the convergence cannot hold for the usual Skorohod topology and the space
$\mathcal{M}_F$ equipped with the corresponding weak topology. Indeed, it can be checked that the
total mass of the limit process is not continuous, which would be in contradiction with the
C-tightness of the sequence $(\nu^{\sigma,K}_{t/(K \mu_K)}, t \ge 0)$, which would hold in case of convergence in
law for the Skorohod topology (since the jump amplitudes are equal to $\frac{1}{K}$ and thus tend
to $0$ as $K$ tends to infinity).

However, certain functionals of the process converge in a stronger sense. Let us for example
consider the average over the population of the phenotypic trait $\phi$. This can be easily
extended to more general symmetric functions of the allele.



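Theorem 4.7 Assume that $u \mapsto \phi(u, u)$ is strictly monotone. Define

$$T^{\sigma,K}_{\eta,\phi} = \inf\Big\{ t > 0,\ d\Big( \frac{\langle \nu^{\sigma,K}_{t/(K \mu_K)}, \phi \rangle}{\langle \nu^{\sigma,K}_{t/(K \mu_K)}, 1 \rangle},\ J_\phi \Big) \le \eta \Big\},$$

where $J_\phi = \{ \phi(u, u);\ u \in J \}$.

Under the assumptions of Theorem 4.5 the process

$$(R_t^{\sigma,K}, t \ge 0) = \Big( \frac{\langle \nu^{\sigma,K}_{t/(K \mu_K)}, \phi \rangle}{\langle \nu^{\sigma,K}_{t/(K \mu_K)}, 1 \rangle}\, \mathbb{1}_{\{T^{\sigma,K}_{\eta,\phi} > t\}},\ t \ge 0 \Big)$$

converges in law in the sense of the Skorohod M1 topology to the process
$\big( \phi(Z_t^\sigma, Z_t^\sigma)\, \mathbb{1}_{\{T^{\sigma}_{\eta,\phi} > t\}},\ t \ge 0 \big)$, where $T^{\sigma}_{\eta,\phi} = \inf\{ t > 0,\ d(\phi(Z_t^\sigma, Z_t^\sigma), J_\phi) \le \eta \}$.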



The Skorohod $M_1$ topology is a weaker topology than the usual $J_1$ topology, allowing
processes with jumps tending to 0 to converge to processes with jumps (see Skorohod
(1956)). For a cad-lag function x on [0,T], the continuity modulus for the $M_1$ topology is
given by

$$w_\delta(x) = \sup_{\substack{0 \le t_1 \le t \le t_2 \le T;\\ 0 \le t_2 - t_1 \le \delta}} d\big(x(t), [x(t_1), x(t_2)]\big).$$

Note that if the function x is monotone, then $w_\delta(x) = 0$.
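To make the role of monotonicity concrete, here is a minimal sketch of a discrete analogue of the $M_1$ modulus on a sampled path. The function name `m1_modulus` and the sample paths are illustrative assumptions, not part of the paper; the supremum is taken over grid points only.

```python
def m1_modulus(times, values, delta):
    """Discrete analogue of w_delta(x): the largest distance from x(t) to the
    segment [x(t1), x(t2)] over grid points t1 <= t <= t2 with t2 - t1 <= delta.
    `times` is assumed to be sorted in increasing order."""
    worst = 0.0
    n = len(times)
    for i in range(n):
        for k in range(i, n):
            if times[k] - times[i] > delta:
                break  # later k only widen the window further
            lo = min(values[i], values[k])
            hi = max(values[i], values[k])
            for j in range(i, k + 1):
                # distance from values[j] to the interval [lo, hi]
                worst = max(worst, lo - values[j], values[j] - hi)
    return worst


times = [0, 1, 2, 3, 4]
ramp = [0.0, 0.0, 0.5, 1.0, 1.0]   # monotone: a jump resolved in small steps
spike = [0.0, 0.0, 1.0, 0.0, 0.0]  # non-monotone excursion

print(m1_modulus(times, ramp, 2))
print(m1_modulus(times, spike, 2))
```

The monotone ramp has modulus zero however steep it is, which is why a sequence of paths climbing in many small steps can converge in $M_1$ to a path with a single jump; the spike is penalized by its full height.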



Proof From the results of Theorem 4.5 it follows easily that the finite dimensional
distributions of $(R^{\sigma,K}_t, t \ge 0)$ converge to those of the limit process. By Skorohod (1956),
Theorem 3.2.1, it remains to prove that for all $\eta > 0$,

$$\lim_{\delta \to 0}\, \limsup_{K \to \infty}\, P\big(w_\delta(R^{\sigma,K}) > \eta\big) = 0.$$

The rate of mutations of $(R^{\sigma,K}_t, t \le T)$ being bounded, the probability that two mutations
occur within a time less than $\delta$ is $o(\delta)$. It is therefore enough to study the case where there



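is at most one mutation on the time interval $[0,\delta]$. As in the proof of Proposition 3.2,
with probability tending to 1 when K tends to infinity, the process $(R^{\sigma,K}_t, t \ge 0)$ is close
to $F_W(t/K\mu_K)$, where $F_W$ is defined by

$$F_W(t) = \frac{\langle \varphi_t(M_0), W \rangle}{\langle \varphi_t(M_0), 1 \rangle}
\quad\text{and}\quad
W = \begin{pmatrix} \phi(u_A,u_A) \\ \phi(u_A,u_a) \\ \phi(u_a,u_a) \end{pmatrix}.$$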



Recall that $\varphi_t$ is the flow defined by the vector field (see Proposition 3.2). Away from
invading mutations, the function $F_W$ is constant and the modulus of continuity tends to
0. Around an invading mutation, it follows from Corollary C.4 that the function $F_W$ is
monotone. Therefore the same conclusion holds.



□ 



5 Small mutational steps - the time scale of the canonical 
equation 

We are now interested in studying the convergence of the TSS when the mutation amplitude $\sigma$
tends to zero. Without rescaling time, the TSS trivially tends to a constant. In order to get
a nontrivial limit, we have to rescale time adequately, namely with $1/\sigma^2$, since $S(u_A; u_A) = 0$.

Theorem 5.1 Assume that the initial values $Z^\sigma_0$ are uniformly bounded in $L^2$ and that
they converge to $Z^0_0$ as $\sigma$ tends to 0. Then, the sequence of processes $(Z^\sigma_{t/\sigma^2}, t \ge 0)$ tends in
law in $D([0,T],\mathbb{R})$ to the deterministic (continuous) solution $(u(t), t \ge 0)$ of the canonical
equation

$$\frac{d}{dt}u(t) = f(u(t),u(t))\,\bar n(u(t)) \int_{\mathbb{R}} h\,[h\,\partial_1 S(u(t);u(t))]_+\, m(u(t),h)\,dh, \qquad (5.1)$$

where

$$\bar n(u) = \frac{f(u,u) - D(u,u)}{C((u,u),(u,u))}.$$

The proof of this theorem is similar to the proof of Theorem 4.1 in Champagnat and
Meleard (2011).

In this general form the canonical equation is still of little practical use, although already
some qualitative conclusions can be drawn from it. The trait increases whenever the
fitness gradient $\partial_1 S(u;u)$ is positive and decreases when it is negative, i.e., movement is
always uphill with respect to the current allelic fitness landscape $S(\cdot\,;u)$. The equilibria
of (5.1) correspond to the allelic evolutionarily singular strategies, except that close to
those strategies (5.1) is no longer applicable since in their neighborhood the convergence of
the underlying individual-based process to the simple TSS becomes slower and slower. So
all we can deduce from the canonical equation (5.1) is that for small mutational steps the
trait substitution sequence will move to some close neighborhood of an allelic evolutionarily
singular strategy.
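The uphill claim can be read off from the integral in (5.1) in one line. The following worked step is a sketch: it writes $g = \partial_1 S(u;u)$ as shorthand (a symbol introduced here, not in the paper) and assumes only that $m(u,\cdot)$ is a probability density with finite second moment.

```latex
\[
\int_{\mathbb{R}} h\,[h\,g]_+\, m(u,h)\,dh
  \;=\; g \int_{\{h \,:\, h g > 0\}} h^{2}\, m(u,h)\,dh
  \;=\;
  \begin{cases}
    g \displaystyle\int_{0}^{\infty} h^{2}\, m(u,h)\,dh \;\ge\; 0, & g > 0,\\[2ex]
    g \displaystyle\int_{-\infty}^{0} h^{2}\, m(u,h)\,dh \;\le\; 0, & g < 0.
  \end{cases}
\]
```

Since the remaining factors on the right-hand side of (5.1) are positive, the sign of $\frac{d}{dt}u$ is the sign of the fitness gradient, provided $m(u,\cdot)$ puts mass on the corresponding half-line.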

Remark 5.2 If we had considered extended TSSes taking values in the powers of the trait
space as is done in Metz et al. (1996), the convergence to the canonical equation would
similarly have gone awry due to a slowing down of the convergence near evolutionarily
singular strategies, and the occurrence of polymorphism close to some of them, with adaptive
branching as a particularly salient example; branching can only be investigated with a time
scaling different from the one for the canonical equation (Metz et al. (1996); Champagnat
and Meleard (2011)).



To get from the previous observation to some biological conclusion we need to decompose
the genotypic fitness function S into its ecological and developmental components

$$S_{Aa;AA} = S(\phi_{Aa}; \phi_{AA}) = f(\phi_{Aa}) - D(\phi_{Aa}) - C(\phi_{Aa},\phi_{AA})\,\frac{f(\phi_{AA}) - D(\phi_{AA})}{C(\phi_{AA},\phi_{AA})}, \qquad (5.2)$$

with

$$\phi_{Aa} = \phi(u_A,u_a), \quad \phi_{AA} = \phi(u_A,u_A), \quad f(\phi_{Aa}) = f(u_A,u_a), \text{ etc.,}$$

and

$$\partial_1 S(u;u) = \partial_1 S(\phi(u;u);\phi(u;u))\,\partial_1\phi(u,u). \qquad (5.3)$$

Hence, the allelic singular strategies are of two different types, ecological, characterized
by $\partial_1 S(\phi(u;u);\phi(u;u)) = 0$, and developmental, characterized by $\partial_1\phi(u,u) = 0$. On the
phenotypic level the latter are perceived as developmental constraints (cf. Van Dooren
(2000)).



To arrive at quantitative conclusions we have to make additional assumptions about the
within individual processes. One often used assumption is that the mutation distribution
is symmetric. With that assumption (5.1) reduces to

$$\frac{d}{dt}u(t) = \frac{1}{2}\,\bar n(u(t))\,V_a(u(t))\,\partial_1 S(u(t);u(t)), \qquad (5.4)$$

with $V_a$ the allelic mutational variance. (The factor $\frac{1}{2}$ comes from the fact that the
integration is only over a half-line.) This equation can easily be lifted to the phenotypic level
as

$$\frac{d}{dt}U(t) = \bar n(U(t))\,V_p(U(t))\,\partial_1 S(U(t);U(t)), \qquad (5.5)$$

with $U = \phi(u,u)$ and $V_p$ the phenotypic mutational variance, an equation fully phrased in
population level observables. The factor $\frac{1}{2}$ is canceled by a factor 2 coming from the fact
that the fitness S refers to heterozygotes with only one mutant allele, while after a
substitution the other allele is also a mutant one. For this equation only the ecological singular
strategies remain while developmental constraints appear in the form of $V_p$ becoming zero
(cf. Van Dooren (2000)). (It is also possible to lift (5.1) to the phenotypic level. However,
the truncated first and second moments that appear in the resulting expression are no
longer well-established statistics that can be measured independent of any knowledge of
the surrounding ecology.)
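As an illustration of how (5.5) behaves, the following sketch integrates it with an explicit Euler scheme. Every ingredient is a hypothetical choice made for the example and not taken from the paper: a constant net growth rate `R` standing for $f - D$, a Gaussian carrying-capacity function `k`, a Gaussian competition kernel `C`, a constant mutational variance `V_P`, and a finite-difference fitness gradient.

```python
import math

# All ingredients below are hypothetical choices made for illustration only.
R = 1.0         # f - D, taken constant
PHI0 = 0.0      # phenotype at which the carrying capacity peaks
SIGMA_K = 1.0   # width of the carrying-capacity function k
SIGMA_A = 0.5   # width of the competition kernel
V_P = 1.0       # phenotypic mutational variance, taken constant


def k(phi):
    return math.exp(-(phi - PHI0) ** 2 / (2 * SIGMA_K ** 2))


def C(phi, psi):
    # Competition felt by phenotype phi from phenotype psi; C(phi, phi) = R / k(phi).
    return (R / k(phi)) * math.exp(-(phi - psi) ** 2 / (2 * SIGMA_A ** 2))


def n_bar(phi):
    # Monomorphic equilibrium density (f - D) / C(phi, phi).
    return R / C(phi, phi)


def S(mutant, resident):
    # Invasion fitness f - D - C(mutant, resident) * n_bar(resident), as in (5.2).
    return R - C(mutant, resident) * n_bar(resident)


def fitness_gradient(U, eps=1e-6):
    # Central-difference approximation of the derivative of S in its first argument.
    return (S(U + eps, U) - S(U - eps, U)) / (2 * eps)


def integrate(U0, t_end, dt=0.01):
    # Explicit Euler scheme for dU/dt = n_bar(U) * V_P * d_1 S(U; U).
    U, path = U0, [U0]
    for _ in range(int(round(t_end / dt))):
        U += dt * n_bar(U) * V_P * fitness_gradient(U)
        path.append(U)
    return path


path = integrate(U0=1.0, t_end=20.0)
print(path[0], path[-1])
```

With these choices the fitness gradient at a resident `U` is proportional to `-(U - PHI0)`, so the trajectory climbs the fitness landscape towards the singular strategy `PHI0`, in line with the qualitative discussion of (5.1). The sketch says nothing about what happens near `PHI0`, where the canonical equation is no longer applicable.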
6 Discussion

This paper forms part of a series by a varied collection of authors that aims at putting
the tools of adaptive dynamics on a rigorous footing: Metz, Nisbet and Geritz (1992);
Dieckmann and Law (1996); Metz et al. (1996); Geritz et al. (1998); Champagnat, Ferriere
and Meleard (2008); Champagnat (2006); Durinx, Metz and Meszena (2008); Meleard and
Tran (2009); Champagnat and Meleard (2011); Metz (in press); Klebaner et al. (2011);
Bovier and Champagnat (in preparation) (see also Diekmann et al. (2005); Barles and
Perthame (2007); Carrillo, Cuadrado and Perthame (2007); Desvillettes et al. (2008)). It
is the first in the series to treat the individual-based justification of the adaptive dynamics
tools in a genetic setting. As such it forms the counterpart of the more heuristic, but
also more general Metz (in press). We only consider unstructured Lotka-Volterra type
populations and single locus genetics, in line with applied papers such as Kisdi and Geritz
(1999); Van Dooren (1999); Proulx and Phillips (2006); Peischl and Burger (2008). For
such models we proved the convergence (for large population sizes and suitably small
mutation probabilities) of the individual-based stochastic process to the TSS of adaptive
dynamics, and the subsequent convergence (for small mutational steps) of the TSS to
the canonical equation. Not wholly unexpectedly, the results are in agreement with the
assumed framework of the more applied work. Yet, to arrive at a rigorous proof new
developments were needed, like the derivation of a rigorous estimate for the probability
of invasion in a dynamic diploid population (Appendix D), a rigorous, geometric singular
perturbation theory based, invasion implies substitution theorem (Appendix C), and the
use of the Skorohod $M_1$ topology to arrive at a functional convergence result for the TSS
(Section 4).

Below we list the remaining biological limitations of the present results and the correspond- 
ing required further developments. 

The first limitation is the assumption of an unstructured population. For a fair number
of real populations the assumption of random deaths appears to match the observations,
but no organisms reproduce in a Poisson process starting at birth. Moreover, in nature a
good amount of population regulation occurs through processes affecting the birth rate, as
when a scarcity of resources translates into a delay of maturing to the reproductive condition.
Durinx, Metz and Meszena (2008) heuristically treat very general life histories (although
only for a finite number of birth states, a finite number of variables channeling the
interaction between individuals, and a deterministic population dynamics converging to a
unique equilibrium) based on the population dynamical modeling framework of Diekmann
et al. (1998, 2001); Diekmann, Gyllenberg and Metz (2003). However, that paper only considers the
convergence to the canonical equation, starting from the TSS, conjectured to be derivable
from the population dynamical model, with the goal of relating its coefficient functions to
observationally accessible statistics of individual behavior. In fact, even the convergence
to a deterministic population model, as in Theorem 3.1, does not easily fit in the scheme
of Fournier and Meleard (2004) in the (biologically common) cases where the movement
of individuals through their state spaces depends directly or indirectly on the population
size and composition. (The special case where this movement decomposes in a product of
a population- and a state-dependent term is covered in Tran (2006, 2008); Ferriere and
Tran (2009).)
A further limitation is that we assumed the trait to be governed by only a single locus
(in keeping with a well-established tradition starting with Kimura (1965)). The multi-locus
case still has to be worked out. The superficially easier case with infinitely
many loci, so that no two mutations ever occur at the same locus, is considered from a heuristic
perspective in Metz (in press); Metz and de Kovel (in preparation). However, the problem
of rigorously setting up the underlying individual-based model as a limit for models with
an ever increasing number of loci still needs to be tackled.

The final extension to be considered is to higher dimensional geno- and phenotypic trait 
spaces. We conclude with a heuristic discussion of the form such an extension will take. 



On the genotypic level the canonical equation will take essentially the same form as (5.1)
and (5.4), with scalar $u$, $h$ and $\partial_1 S$ replaced by vectors, and the mutational variance by
a covariance matrix, just as this is written in Dieckmann and Law (1996); Champagnat,
Ferriere and Meleard (2008); Durinx, Metz and Meszena (2008); Champagnat and Meleard
(2011) for the clonal and Metz (in press); Metz and de Kovel (in preparation) for the
Mendelian case. However, there is one remaining snag, which is the reason why we opted
for treating only the one-dimensional case. In the directions orthogonal to the selection 
gradient the fitness landscape around the resident strategy has the same shape as at an 
evolutionarily singular strategy. In the one-dimensional case we opted for just removing 
the neighborhoods of the singular strategies. If we were to apply the same strategy for the 
higher dimensional case we would have to remove all residents. The way out is by observing 
that the directions where something awry may occur are but a very small minority among 
all possible directions in which mutations may occur. Heuristic calculations suggest that 
the trouble only occurs in a narrow double horn with a boundary that at the resident 
strategy is tangent to the linear manifold orthogonal to the selection gradient, so that when 
the mutational step size $\sigma$ goes to zero, the probability of a mutant ending up in that horn
decreases as some higher power of $\sigma$. Moreover, in the directions orthogonal to the fitness
gradient the fitness is a quadratic function, making the probability of invasion scale not 
linearly but quadratically with the size of any mutational steps in those directions. The 
main problem with such mutants is that some of them may on the population dynamical 
time scale keep coexisting with the resident. Further heuristic calculations then suggest 
that for such a resident pair the probability of invasion of a subsequent mutant more 
in the direction of the fitness gradient is to the lowest order of approximation - in the 
distance between the two residents - equal to the probability of invasion in a monomorphic 
population of the average type, and that such a mutant ousts both residents. Therefore 
the general (i.e., more type) TSS is close to a simple TSS in which those untoward mutants 
are just removed from the consideration, the smaller the mutational step the closer. We 
put rigorously underpinning this scenario forward as the last of our list of challenges. 
A A few words about competition kernels

In the ecological literature the models described in Section 2 are known as Lotka-Volterra
competition models (Lotka (1925); Volterra (1931)). The early LV models were all
deterministic, phrased as ODEs corresponding to large population limits such as considered
in Section 3, without mutations. The determinism together with the assumption of
clonal reproduction obviated the need to separately model births and deaths: competition
was represented as its overall effect on the population growth rate. The later stochastic
models, e.g. Dieckmann and Law (1996); Metz et al. (1996), usually put the effect of
competition only in the death rate, as otherwise the chosen linear form of the interaction
might lead to negative birth rates.

The simplest case is when C = 0. This is the case customarily put forward in population 
genetics textbooks as starting point for the derivation of their deterministic models for 
gene frequency change by selection, but for the fact that population geneticists usually 
work in discrete time. The unnatural consequence that the population either will die out 
or will keep growing indefinitely is made invisible by transforming to relative frequencies. 

The more realistic case of non-selective competition, $C((u_1,u_2);(v_1,v_2)) = C(v_1,v_2)$, leads
to the same population genetical equations. The selective pressures on the gene frequencies
then do not change with the population size or composition as they are caused only by
differences in the fixed mortalities and fertilities.

Where in population genetics the early selection models assumed indefinitely growing
populations, the early stochastic models, in continuous time the Moran-type models, assumed
constant population sizes. Although later variable population sizes were introduced, it was
just assumed that these sizes fluctuated between positive lower and upper bounds (Karlin
(1968); Donnelly and Weber (1985)). Stochastic models with the population regulation
represented in accordance with ecological tradition are relative newcomers (e.g. Metz and
Redig (in preparation)).

The case where the additional death rate incurred by an individual from its competitive
interaction depends only on the genotype of the focal individual and not on that of its
competitors is known in the ecological literature as purely density dependent selection
(Roughgarden (1971, 1976, 1979)) and in the mathematical literature as logistic population
regulation. This logistic case can be generalized to $C((u_1,u_2);(v_1,v_2)) = C'(u_1,u_2)\,C''(v_1,v_2)$,
when it is not the total density but e.g. the total biomass that determines the felt
competitive effect and different genotypes have different biomasses. A further generalization
is that population growth is regulated by a finite number of variables, think for example
of the combination of space and nitrogen depletion:

$$C((u_1,u_2);(v_1,v_2)) = \sum_{i=1}^{k} C'_i(u_1,u_2)\,C''_i(v_1,v_2).$$

The vector $(C''_1,\ldots,C''_k)^T$ is known as the impact of the individuals on their environment,
and the vector $(C'_1,\ldots,C'_k)$ as their sensitivity (Meszena et al. (2006)). The latter
generalization is evolutionarily richer in that it can allow diversification, which is excluded by the
earlier considered kernels. In Durinx, Metz and Meszena (2008) it is shown heuristically
that close to an evolutionarily singular strategy any clonal model evolutionarily behaves
like a Lotka-Volterra competition model of the above type with k equal to one plus the
dimension of the trait space.
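The impact and sensitivity decomposition is just an inner product over the regulating variables. A minimal sketch, in which the function name `competition` and all numbers are hypothetical and chosen only to show the bookkeeping for k = 2:

```python
def competition(sensitivity_u, impact_v):
    """C(u; v) as the sum over regulating variables of
    (sensitivity of the focal genotype u) * (impact of the competitor genotype v)."""
    if len(sensitivity_u) != len(impact_v):
        raise ValueError("need one sensitivity and one impact per regulating variable")
    return sum(s * i for s, i in zip(sensitivity_u, impact_v))


# Hypothetical numbers for k = 2 regulating variables (say space and nitrogen).
sens_focal = [0.5, 2.0]   # how strongly the focal genotype suffers from each variable
impact_a = [1.0, 0.25]    # competitor a mostly occupies space
impact_b = [0.0, 1.0]     # competitor b only depletes nitrogen

print(competition(sens_focal, impact_a))
print(competition(sens_focal, impact_b))
```

With k = 1 every genotype ranks its competitors in the same way, whereas with k of at least 2 different focal genotypes can rank them differently, which is one way to see why only the multi-variable form can support diversification.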

The above considerations all come from either ecology or population genetics, and originally
were phrased for a fixed finite number of types, clonal ones in the ecological and Mendelian
ones in the population genetics literature. The first model characterizing these types in
terms of traits was formulated by Robert MacArthur and Richard Levins (MacArthur and
Levins (1964), see also MacArthur (1970)). This model was later used to great effect
by a large number of authors (e.g. Levins (1968); MacArthur and Levins (1967); May
(1973, 1974); Roughgarden (1976); Christiansen and Fenchel (1977); Roughgarden (1979);
Slatkin (1980), but see also Roughgarden (1989)) to study species packing population
dynamically as well as evolutionarily. The first genetic model of this type was studied
by Freddy Bugge Christiansen and Volker Loeschcke (Christiansen and Loeschcke (1980);
Loeschcke and Christiansen (1984); Christiansen and Loeschcke (1987)), who considered
the possibilities for the coexistence of finite numbers of genotypes. Explicit trait-based
LV-style birth and death process models with mutation only appeared on the scene with
the birth of adaptive dynamics (Dieckmann and Law (1996); Metz et al. (1996)).



The most common assumption in trait-based LV competition models (MacArthur and
Levins (1964); MacArthur (1970, 1972); Roughgarden (1979)) is that

$$C((u_1,u_2);(v_1,v_2)) = C((u_1,u_2);(u_1,u_2))\,
\frac{\int Q(u_1,u_2)\,q((u_1,u_2);z)\,Q(v_1,v_2)\,q((v_1,v_2);z)\,dz}{\int Q^2(u_1,u_2)\,q^2((u_1,u_2);z)\,dz}.$$

Here $z \in \mathbb{R}$ is customarily interpreted as a trait of a fine-grained self-renewing resource
with a fast logistic dynamics that is supposed to be non-evolving. That is, it is assumed
that a resource unit comprises close to infinitely many very small particles, so that the
resource dynamics can be treated as deterministic and that the turnover of the resource
is very fast so that it effectively tracks its deterministic equilibrium as set by the current
consumer population. Functions of $(u_1,u_2)$ depend again on this argument through $\phi$. Q
is the average rate constant for the encounter and absorption of resource particles by our
consumer individuals, expressed in resource units, while q tells how this use is spread over
the resource axis.

The most commonly used parametric form is

$$f(u_1,u_2) - D(u_1,u_2) =: r(u_1,u_2) = r,$$

$$\frac{r(u_1,u_2)}{C((u_1,u_2);(u_1,u_2))} =: k(u_1,u_2) = \exp\left(-\frac{(\phi(u_1,u_2) - \phi_0)^2}{2\sigma_k^2}\right),$$

$$Q(u_1,u_2)\,q((u_1,u_2);z) = \exp\left(-\frac{(z - \phi(u_1,u_2))^2}{\sigma_a^2}\right),$$

leading to

$$C((u_1,u_2);(v_1,v_2)) = r \exp\left(-\frac{(\phi(u_1,u_2) - \phi(v_1,v_2))^2}{2\sigma_a^2} + \frac{(\phi(u_1,u_2) - \phi_0)^2}{2\sigma_k^2}\right).$$
Deterministic models based on this kernel have all sorts of nice mathematical properties,
but Adaptive Dynamically they are a bit degenerate in that when $\sigma_a < \sigma_k$ the final stop
for trait substitution sequences that result from the long term large population and rare
mutations limit, as treated in Section 4, is a Gaussian distribution over trait space (cf.
Roughgarden (1979)) whereas for almost any slightly different model the final stop has
finite support (Gyllenberg and Meszena (2005); Leimar, Doebeli and Dieckmann (2008)).
For this reason adaptive dynamics researchers started to use slightly modified expressions
for k or C. (When K is still finite, the number of branches visible in simulations also stays
finite, due to the early abortion of incipient ones, with the number of recognizable branches
becoming larger with increasing K and $\sigma_k/\sigma_a$ (Claessen et al. (2007, 2008)).) Exploring the
consequences of all sorts of different competition kernels by now has become a little growth
industry; a good sample may be found in Doebeli (2011).



Remark A.1 The description of the mechanism underlying the competition kernel given
above was a bit brash, in keeping with biological tradition. Starting from an underlying
fast logistic resource dynamics actually gives

$$f(u_1,u_2) = y(\phi(u_1,u_2)) \left( \int v(\phi(u_1,u_2),z)\,w(z)\,k_R(z)\,dz - d_1(\phi(u_1,u_2)) \right),$$

$$D(u_1,u_2) = d_2(\phi(u_1,u_2)),$$

$$C((u_1,u_2);(v_1,v_2)) = y(\phi(u_1,u_2)) \int v(\phi(u_1,u_2);z)\,\frac{w(z)\,k_R(z)}{r_R(z)}\,v(\phi(v_1,v_2);z)\,dz,$$

and hence

$$Q(u_1,u_2)\,q((u_1,u_2);z) = V\,v(\phi(u_1,u_2);z) \left( \frac{w(z)\,k_R(z)}{r_R(z)} \right)^{1/2},$$

with $y$ the yield, i.e., $y^{-1}$ is the resource mass needed to make one consumer, $w$ the mass of
a resource unit, $v$ the rate constant of consumers encountering and eating resource units,
$d_1$ the rate constant of consumer mass loss due to basal metabolism, and $d_2$ the consumer
mortality rate, $r_R$ the low density reproductive rate of the resource, and $k_R$ its carrying
capacity. $V$ is some unknown proportionality constant. (In the above terms the time scale
separation results from both $r_R$ and $v$ being very large and $y$ very small with the product
of $y$ and $v$ being $O(1)$.) Apparently the interpretation of Q and q is more complicated
than the standardly attributed one based on the assumption of constant $w k_R / r_R$.

Although time-honoured, the above described mechanistic underpinning is not without
flaws, as explicitly laid out by Chesson (1990). In the derivation it is assumed that,
but for the indirect coupling through the consumers, the dynamics of different resources
are independent. Even very similar resource populations do not compete. However, this
is only possible if their ecological properties depend everywhere discontinuously on the
trait z, since the assumed logistic nature of the resource dynamics means that there is
non-negligible competition between equal resource particles. The alternative assumption
alluded to by MacArthur (1972) that the intrinsic resource dynamics is of a
chemostat type (as can be approximately the case for seeds from perennial plants) also
is problematical: Under the reasonable assumption that the resource mass removed by
a consumer population equals the mass this population acquires, the detrimental effect
from competition becomes non-linear in the competitor densities, instead of being simply
representable by a competition kernel.



B Properties of the vector field (3.6)

B.1 Neutral case.

We first consider the case of neutrality between the A and a alleles, namely $f_{a_1a_2} = f$,
$D_{a_1a_2} = D^0$ and $C_{a_1a_2,b_1b_2} = D^1$ for $a_1a_2, b_1b_2 = AA, Aa, aa$. We have in this case, with
$n = x + y + z$,

$$p = \frac{x + y/2}{n},$$

which is the proportion of allele A. We get for the vector field

$$X_0 = \begin{pmatrix}
f(x + y/2)\,p - (D^0 + D^1 n)\,x \\
f(x + y/2)(1 - p) + f(z + y/2)\,p - (D^0 + D^1 n)\,y \\
f(z + y/2)(1 - p) - (D^0 + D^1 n)\,z
\end{pmatrix}.$$



Theorem B.1 The vector field $X_0$ has a line of fixed points given by

$$\Gamma_0(v) = \begin{pmatrix}
\dfrac{v^2 - 2 n_0 v + n_0^2}{4 n_0} \\[2ex]
-\dfrac{v^2 - n_0^2}{2 n_0} \\[2ex]
\dfrac{v^2 + 2 n_0 v + n_0^2}{4 n_0}
\end{pmatrix}$$

with $n_0 = (f - D^0)/D^1$. That is, we have for any $v$, $X_0(\Gamma_0(v)) = 0$. The parametrization
with $v$ is chosen such that the differential of the vector field $X_0$ at each point of the curve
$\Gamma_0$, $DX_0(\Gamma_0(v))$, has the three eigenvectors

$$e_1(v) = \Gamma_0(v), \qquad
e_2(v) = \frac{d\Gamma_0(v)}{dv} = \frac{1}{2 n_0}\begin{pmatrix} v - n_0 \\ -2v \\ v + n_0 \end{pmatrix}, \qquad
e_3(v) = \frac{d^2\Gamma_0(v)}{dv^2} = \frac{1}{2 n_0}\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix},$$

with respective eigenvalues $D^0 - f < 0$, $0$, and $-f < 0$. The corresponding eigenvectors
of the transposed matrix $DX_0(\Gamma_0(v))^T$, to be denoted by $\beta_1(v)$, $\beta_2(v)$ and $\beta_3(v)$, can be
normalized such that for any $i, j \in \{1, 2, 3\}$ and any $v$

$$\langle \beta_i(v), e_j(v) \rangle = \delta_{ij}.$$
Proof This is easily seen by using the standard variables: total population density,
$n = x + y + z$, relative frequency of the A allele, $p = (x + y/2)/n$, and excess heterozygosity
relative to the Hardy-Weinberg proportion, $h = y/n - 2p(1 - p)$.

In these new coordinates, the vector field $X_0$ becomes the vector field $Y_0$ given by

$$Y_0(n,p,h) = \begin{pmatrix} \big(f - (D^0 + D^1 n)\big)\,n \\ 0 \\ -f h \end{pmatrix}.$$

This vector field obviously vanishes on the line $n = n_0$, $h = 0$. One gets immediately the
results by taking $v = n_0(1 - 2p)$. The spectral results follow by standard computations.

□
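The statements of Theorem B.1 are easy to check numerically. The sketch below uses hypothetical parameter values and hypothetical helper names (`X0`, `gamma0`, `eigen_residual`); it evaluates the neutral vector field on the curve of fixed points and tests the three eigenpairs with a central finite difference instead of an explicit Jacobian.

```python
F, D0, D1 = 2.0, 1.0, 0.5   # hypothetical neutral birth, death and competition rates
N0 = (F - D0) / D1          # equilibrium density n_0


def X0(M):
    """Neutral vector field for the densities (x, y, z) of AA, Aa and aa."""
    x, y, z = M
    n = x + y + z
    p = (x + y / 2) / n
    death = D0 + D1 * n
    return (F * (x + y / 2) * p - death * x,
            F * (x + y / 2) * (1 - p) + F * (z + y / 2) * p - death * y,
            F * (z + y / 2) * (1 - p) - death * z)


def gamma0(v):
    """Point of the curve of fixed points with parameter v = n_0 (1 - 2p)."""
    return ((v - N0) ** 2 / (4 * N0),
            (N0 ** 2 - v ** 2) / (2 * N0),
            (v + N0) ** 2 / (4 * N0))


def eigen_residual(v, e, lam, eps=1e-5):
    """Largest component of DX0(gamma0(v)) e - lam * e, with the directional
    derivative approximated by a central difference."""
    M = gamma0(v)
    plus = X0(tuple(m + eps * c for m, c in zip(M, e)))
    minus = X0(tuple(m - eps * c for m, c in zip(M, e)))
    return max(abs((a - b) / (2 * eps) - lam * c)
               for a, b, c in zip(plus, minus, e))


v = 0.7
e1 = gamma0(v)                                                # eigenvalue D0 - F
e2 = ((v - N0) / (2 * N0), -v / N0, (v + N0) / (2 * N0))      # eigenvalue 0
e3 = (1 / (2 * N0), -1 / N0, 1 / (2 * N0))                    # eigenvalue -F

print(X0(gamma0(v)))
print(eigen_residual(v, e1, D0 - F),
      eigen_residual(v, e2, 0.0),
      eigen_residual(v, e3, -F))
```

The three directions correspond to the coordinates used in the proof: `e1` changes the total density n, `e2` moves along the curve (changing p), and `e3` changes the excess heterozygosity h at fixed n and p.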




B.2 Small perturbations.

We now assume that mutations are small. We denote by $\zeta$ the variation of the allelic trait,
$\zeta = u_a - u_A$. The vector field depends on $\zeta$, and will be denoted by $X(\zeta, M)$. We assume
regularity in $\zeta$ and $M$, and observe that $X(0, M) = X_0(M)$.

In practice we will apply our results to the vector field (3.6) which has a particular algebraic
form. It is however convenient to derive the perturbation results in full generality. We will
come back to the particular case of (3.6) in Section C.

From now on, we will assume that the vector field $X(\zeta, M)$ satisfies the following properties:
for any $x$, any $z$ and any $\zeta$,

$$X_x(\zeta, (0,0,z)) = X_y(\zeta, (0,0,z)) = 0$$

and

$$X_z(\zeta, (x,0,0)) = X_y(\zeta, (x,0,0)) = 0. \qquad (B.1)$$

This comes from the fact that pure homozygotic populations stay pure homozygotic forever.

Our goal in this section is to understand the time asymptotic of the flow associated to the
vector field $X(\zeta, M)$.

Since the curve $\Gamma_0$ is transversally hyperbolic (even transversally contracting, see
Proposition B.1) for the vector field $X_0$, we can apply Theorem 4.1 in Hirsch, Pugh and Shub
(1977) to conclude that for $\zeta$ small enough, there is an attracting curve $\Gamma_\zeta$ invariant
by $X$. Moreover, $\Gamma_\zeta$ is regular and converges to $\Gamma_0$ when $\zeta$ tends to zero. In other words,
there is a small enough tubular neighborhood $\mathcal{V}$ of $\Gamma_0$ such that for any $|\zeta|$ small enough,
$\Gamma_\zeta$ is contained in $\mathcal{V}$ and attracts all the orbits with initial conditions in $\mathcal{V}$. (For earlier,
weaker results in this direction for general differential and difference equation population
dynamical models without genetics see Geritz et al. (2002); Dercole and Rinaldi (2008,
Appendix B).)

Applying Theorem 4.1 in Hirsch, Pugh and Shub (1977) requires that the curve $\Gamma_0$ is
a compact manifold without boundary, but this is not the case here. However one can
perform some standard surgery to put our problem in this form in a neighborhood of the
part of $\Gamma_0$ which lies in the positive quadrant, which is the only part of phase space that
matters for us.
B.2.1 Location of the zeros of the perturbed vector field.

Since the curve $\Gamma_\zeta$ is invariant and (locally) attracting for the flow associated to the vector
field $X(\zeta, M)$, where $M$ stands for the vector $(x, y, z)$, it is enough to study the flow on this
curve. In particular, since $\Gamma_\zeta$ is a curve, if the vector field does not vanish on $\Gamma_\zeta$ except
at the intersections with the lines $x = y = 0$ and $y = z = 0$ (the fixed points aa and
AA respectively, see Theorem 3.3), we know that the orbit of any initial condition on $\Gamma_\zeta$
(between AA and aa) will converge either to AA or to aa.

We now look for the fixed points on $\Gamma_\zeta$ of the flow associated to the vector field $X(\zeta, M)$,
which are the points where the vector field vanishes. Since $\Gamma_\zeta$ is attracting, it is equivalent
(and more convenient) to look for the fixed points in $\mathcal{V}$.

It is convenient to use for this study local frames in the tubular neighborhood $\mathcal{V}$ of $\Gamma_0$.
There are many possibilities for defining such frames; we found that a convenient one is to
represent a point $M$ by the parametrisation

$$M(v,r,s) = \Gamma_0(v) + r\,e_1(v) + s\,e_3(v) = (1 + r)\,\Gamma_0(v) + s\,\frac{d^2\Gamma_0}{dv^2}$$

with $v \in [-n_0 - \delta, n_0 + \delta]$, $r \in [-\delta, \delta]$, $s \in [-\delta, \delta]$ with $\delta > 0$ to be chosen small enough
later on. We observe that $M(v, 0, 0) = \Gamma_0(v)$.

The Jacobian of the transformation $(v, r, s) \mapsto (x, y, z) = M(v, r, s)$ is equal to $-(1 + r)/2$
and therefore does not vanish if $0 < \delta < 1$. It is easy to verify that if $\delta > 0$ is small enough,
the map $(v, r, s) \mapsto M(v, r, s)$ is a diffeomorphism of $[-n_0 - \delta, n_0 + \delta] \times [-\delta, \delta]^2$ to a closed
neighborhood of $\mathcal{V}$ (provided this tubular neighborhood is small enough). In particular,
once $\delta > 0$ is chosen, for any $\zeta > 0$ small enough, $\mathcal{V}$ contains the intersection of $\Gamma_\zeta$ with
the first quadrant (by continuity of $\Gamma_\zeta$ in $\zeta$).

In order to find the zeros of the vector field $X(\zeta, M)$, we will use convenient linear
combinations of its components which reflect the fact that the flow is transversally hyperbolic.
We will first equate to zero two linear combinations of the components, and by the
implicit function theorem this will lead to a curve containing all possible zeros. We will then
look at the points on this curve where the third (independent) linear combination of the
components vanishes.

Proposition B.2 For any $\delta > 0$ small enough, there is a number $\zeta_0 = \zeta_0(\delta)$ such that for
any $\zeta \in [-\zeta_0, \zeta_0]$ there is a smooth curve $\mathcal{Z}_\zeta = (r_\zeta(v), s_\zeta(v)) \subset \mathbb{R}^2$, depending smoothly on
$\zeta$, and converging to 0 when $\zeta$ tends to zero such that for any $v \in [-n_0 - \delta, n_0 + \delta]$ we have

$$\langle \beta_1(v), X(\zeta, M(v, r_\zeta(v), s_\zeta(v))) \rangle = \langle \beta_3(v), X(\zeta, M(v, r_\zeta(v), s_\zeta(v))) \rangle = 0.$$

Moreover, if a point $(v, r, s)$ with $v \in [-n_0 - \delta, n_0 + \delta]$, $r$ and $s$ small enough is such that

$$\langle \beta_1(v), X(\zeta, M(v,r,s)) \rangle = \langle \beta_3(v), X(\zeta, M(v,r,s)) \rangle = 0$$

then $(r, s) = (r_\zeta(v), s_\zeta(v))$.
Proof Consider the map $F$ from $\mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}^2$ given by

$$F((\zeta,v),(r,s)) = \big( \langle \beta_1(v), X(\zeta, M(v,r,s)) \rangle,\ \langle \beta_3(v), X(\zeta, M(v,r,s)) \rangle \big).$$

For any $v_0 \in [-n_0 - \delta, n_0 + \delta]$, and $|\zeta|$ small enough, the differential of $F$ in $(r, s)$ at
$(0, v_0, 0, 0)$ is invertible. This follows by continuity from the same result in $\zeta = 0$ where the
determinant of the differential is $f(f - D^0)$. Therefore, by the implicit function theorem
(see for example Dieudonne (1969)), for any $v_0 \in [-n_0 - \delta, n_0 + \delta]$, there is an open
neighborhood $U_{v_0}$ of $(v_0, 0)$ in $\mathbb{R}^2$ and two regular functions on $U_{v_0}$, $r^{v_0}$ and $s^{v_0}$,
such that for any $(\zeta, v) \in U_{v_0}$ we have

$$F\big((\zeta,v),(r^{v_0}(\zeta,v), s^{v_0}(\zeta,v))\big) = 0.$$

Since the set $[-n_0 - \delta, n_0 + \delta] \times \{0\}$ is compact in $\mathbb{R}^2$, we can find a finite sequence $v_1, \ldots, v_m$
such that the finite sequence of sets $(U_{v_j})$ is a finite open cover of $[-n_0 - \delta, n_0 + \delta] \times \{0\}$. We
now define the functions $r$ and $s$ in the tubular neighborhood $\cup_j U_{v_j}$ of $[-n_0 - \delta, n_0 + \delta] \times \{0\}$
by

$$r(\zeta,v) = r^{v_j}(\zeta,v), \qquad s(\zeta,v) = s^{v_j}(\zeta,v), \qquad \text{for } (\zeta,v) \in U_{v_j}.$$

This definition is consistent since if $(\zeta, v) \in U_{v_j} \cap U_{v_l}$ with $l \ne j$ we have $r^{v_j}(\zeta, v) = r^{v_l}(\zeta, v)$
and $s^{v_j}(\zeta, v) = s^{v_l}(\zeta, v)$ by the uniqueness of the solution in the implicit function theorem.
The last assertion of the proposition follows also from the uniqueness of the solution in the
implicit function theorem. □

It follows immediately from the above result that the vector field $X(\zeta, \cdot)$ vanishes in a small enough neighborhood of $\Gamma_0$ if and only if

\[
\langle \beta_2(v), X(\zeta, M(v, r^\zeta(v), s^\zeta(v))) \rangle = 0 ,
\]

which at a given $\zeta$ is an equation for $v$.

We analyze a neighborhood of the point $\zeta = 0$. We first observe that

\[
\langle \beta_2(v), X(0, M(v, r^0(v), s^0(v))) \rangle = \langle \beta_2(v), X(0, M(v, 0, 0)) \rangle = \langle \beta_2(v), X(0, \Gamma_0(v)) \rangle = 0 .
\]

Therefore by the Malgrange preparation theorem (Golubitsky and Guillemin (1973)) (the Weierstrass preparation theorem in the analytic setting), we can write

\[
\langle \beta_2(v), X(\zeta, M(v, r^\zeta(v), s^\zeta(v))) \rangle = \zeta^2 h(\zeta, v) + \zeta g(v) . \qquad \text{(B.2)}
\]



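Lemma B.3 The function $g$ in (B.2) is given by

\[
g(v) = \langle \beta_2(v), \partial_\zeta X(0, \Gamma_0(v)) \rangle .
\]

Proof We have

\[
\begin{aligned}
g(v) &= \Big\langle \beta_2(v), \partial_\zeta \big[ X(\zeta, M(v, r^\zeta(v), s^\zeta(v))) \big] \Big|_{\zeta=0} \Big\rangle \\
&= \langle \beta_2(v), \partial_\zeta X(0, \Gamma_0(v)) \rangle + \Big\langle \beta_2(v), DX(0, \Gamma_0(v)) \, \partial_r M(v,0,0) \, \partial_\zeta r^\zeta(v) \Big|_{\zeta=0} \Big\rangle \\
&\quad + \Big\langle \beta_2(v), DX(0, \Gamma_0(v)) \, \partial_s M(v,0,0) \, \partial_\zeta s^\zeta(v) \Big|_{\zeta=0} \Big\rangle \\
&= \langle \beta_2(v), \partial_\zeta X(0, \Gamma_0(v)) \rangle + \Big\langle \beta_2(v), DX(0, \Gamma_0(v)) \, e_1(v) \, \partial_\zeta r^\zeta(v) \Big|_{\zeta=0} \Big\rangle \\
&\quad + \Big\langle \beta_2(v), DX(0, \Gamma_0(v)) \, e_3(v) \, \partial_\zeta s^\zeta(v) \Big|_{\zeta=0} \Big\rangle .
\end{aligned}
\]

The lemma follows at once from Proposition B.1. □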

The following result gives conditions for the perturbed vector field to have only two fixed points near $\Gamma_0$.

Theorem B.4 Assume the function

\[
g(v) = \langle \beta_2(v), \partial_\zeta X(0, \Gamma_0(v)) \rangle
\]

satisfies $dg/dv(\pm n_0) \neq 0$ and does not vanish in $(-n_0, n_0)$. Then for $|\zeta|$ small enough (but non zero), the vector field $X$ has only two zeros in a tubular neighborhood of $\Gamma_0$. These zeros are $(n_{AA}(\zeta), 0, 0)$ and $(0, 0, n_{aa}(\zeta))$ with $n_{AA}(\zeta)$ and $n_{aa}(\zeta)$ regular near $\zeta = 0$ and $n_{AA}(0) = n_{aa}(0) = n_0$.

As we will see in the proof, $g(\pm n_0) = 0$ and the condition $dg/dv(\pm n_0) \neq 0$ ensures that these zeros are isolated.

Proof We observe that

\[
X(\zeta, \Gamma_0(-n_0)) = X(\zeta, (n_0, 0, 0)) ,
\]

hence

\[
\partial_\zeta X_y(\zeta, \Gamma_0(-n_0)) = \partial_\zeta X_z(\zeta, \Gamma_0(-n_0)) = 0 .
\]
On the other hand, by a direct computation one gets 

/Sz(-no) = I 1 

and we get $g(-n_0) = 0$. Similarly one has $g(n_0) = 0$.
Since the functions $g$ and $h$ in (B.2) are regular, for $|\zeta|$ small, it follows that the function $v \mapsto \langle \beta_2(v), X(\zeta, M(v, r^\zeta(v), s^\zeta(v))) \rangle$ can vanish only in neighborhoods of points where $g$ vanishes. We conclude that if $g$ does not vanish on the open interval $]-n_0, n_0[$, and

\[
\frac{dg}{dv}(\pm n_0) \neq 0 ,
\]

there is a number $\delta' > 0$ such that for $|\zeta|$ small enough non zero, the function $v \mapsto \langle \beta_2(v), X(\zeta, M(v, r^\zeta(v), s^\zeta(v))) \rangle$ has at most two zeros in the interval $[-n_0 - \delta', n_0 + \delta']$. Such zeros must be simple and near $\pm n_0$. By Theorem 3.3 we conclude that these two zeros exist and are the two fixed points aa and AA respectively. □
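
The localisation argument in this proof can be illustrated numerically. The sketch below is only a toy: the functions `g` and `h` and the value `N0` are hypothetical stand-ins chosen so that `g` vanishes exactly at the two endpoints with nonzero derivative; they are not the functions of the model. It looks for the zeros of $\zeta h(\zeta, v) + g(v)$, that is, of the right-hand side of the preparation-theorem decomposition divided by $\zeta \neq 0$.

```python
import math

N0 = 1.0  # hypothetical value standing in for n_0


def g(v):
    # Hypothetical g: vanishes only at +/- N0, with nonzero derivative there.
    return v * v - N0 * N0


def h(zeta, v):
    # Hypothetical bounded, regular remainder term.
    return math.cos(v) + zeta * v


def reduced(zeta, v):
    # zeta^2 * h + zeta * g divided by zeta (zeta != 0).
    return zeta * h(zeta, v) + g(v)


def bisect(func, lo, hi, tol=1e-12):
    # Plain bisection; assumes func changes sign on [lo, hi].
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def zeros_near_endpoints(zeta, delta=0.5):
    # One zero is sought in a delta-window around each endpoint +/- N0.
    def f(v):
        return reduced(zeta, v)
    return (bisect(f, -N0 - delta, -N0 + delta),
            bisect(f, N0 - delta, N0 + delta))
```

For a small value such as `zeta = 0.01`, the two zeros returned by `zeros_near_endpoints` sit close to `-N0` and `N0`, and `reduced` keeps a constant sign on the interior of the interval, which is the qualitative picture used in the proof.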



C Applications to the process of mutant substitution 



Recall that in our setting, the resident population is monomorphic with genotype $(u_A, u_A)$. The mutant allelic trait $u_a$ is given by

\[
u_a = u_A + \zeta ,
\]

where $\zeta$ has been chosen according to the distribution $m_\sigma(u_A, h)\,dh$ and therefore $|\zeta| < \sigma$.



C.1 The stable manifold of the AA fixed point.



As we have seen before in Theorem 3.3, the stability of the fixed point AA can be decided by looking at the fitness of the mutant. We will need later on a property of the stable manifold in the case where this fixed point is unstable.

Theorem C.1 For $|\zeta|$ small enough, if $S_{Aa,AA}(\zeta) > 0$, the local stable manifold of the unstable fixed point AA intersects the closed positive quadrant only along the line $y = z = 0$. The local unstable manifold is contained in the curve $\Gamma_\zeta$.



Proof Hyperbolicity follows from Theorem 3.3 and we can apply Theorem 5.1 in Hirsch, Pugh and Shub (1977). From Theorem 3.3 one finds that the Jacobian matrix $DX_{AA}$ has three eigenvectors

\[
E_1(\zeta) = e_1(-n_0) + O(\zeta), \quad E_2(\zeta) = e_2(-n_0) + O(\zeta), \quad E_3(\zeta) = e_3 + O(\zeta)
\]

with respective eigenvalues $D^0 - f + O(\zeta)$, $O(\zeta)$, $-f + O(\zeta)$.

It follows from Theorem 5.1 in Hirsch, Pugh and Shub (1977) that the local stable manifold $W^{s,\mathrm{loc}}_{AA}$ of AA is a piece of regular manifold tangent in AA to the two dimensional affine stable subspace $E^s(\zeta)$ with origin in AA, and spanned by the vectors $E_1(\zeta)$ and $E_3(\zeta)$.

The $x$ axis ($y = z = 0$) is invariant by the vector field and is contained in the stable manifold. The first result follows from the fact that $E^s(\zeta)$ intersects the closed positive quadrant only along the line $y = z = 0$.

Since the local (one dimensional) unstable manifold $W^{u,\mathrm{loc}}_{AA}(\zeta)$ of AA is tangent to the linear unstable direction $E_2(\zeta)$ in AA, it is enough to show that this direction points inside the quadrant. This follows immediately from the expression of $E_2(\zeta)$. By uniqueness of the invariant curve (see Theorem 5.1 in Hirsch, Pugh and Shub (1977)), we conclude that $W^{u,\mathrm{loc}}_{AA}(\zeta) \subset \Gamma_\zeta$, and the result follows by the invariance of the positive quadrant by the flow. □
C.2 Invasion and fixation conditions.

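Recall that the functions $f(u_1, u_2)$, $D(u_1, u_2)$ and $C((u_1, u_2), (v_1, v_2))$ are symmetric in $(u_1, u_2)$ and $(v_1, v_2)$. Since $u_a = u_A + \zeta$ we have

\[
f_{AA} = f(u_A, u_A), \quad f_{Aa} = f(u_A + \zeta, u_A), \quad \text{etc.,}
\]

and

\[
f_{Aa} = f_{AA} + \frac{df_{Aa}}{d\zeta}(0)\,\zeta + O(\zeta^2), \quad f_{aa} = f_{AA} + 2\,\frac{df_{Aa}}{d\zeta}(0)\,\zeta + O(\zeta^2), \quad \text{etc.}
\]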

After some elementary computations one gets 

< \ 1 dSAa,AA / n x / 2 2 \ 

= Tt — (°) ( u ~ • 

Therefore, if

\[
\frac{dS_{Aa,AA}}{d\zeta}(0) \neq 0 ,
\]

the function $g$ vanishes only for $v = \pm n_0$, and the vector field $X(\zeta, \cdot)$ has for small $|\zeta| \neq 0$ only two fixed points near the intersection of the curve $\Gamma_0$ with the positive quadrant (these fixed points are on the lines $x = y = 0$ and $z = y = 0$).

Note that at neutrality we have $S_{Aa,AA}(0) = 0 = S_{Aa,aa}(0)$, hence

\[
S_{Aa,AA}(\zeta) = \frac{dS_{Aa,AA}}{d\zeta}(0)\,\zeta + O(\zeta^2) ,
\]

and similarly for $S_{Aa,aa}(\zeta)$.

Hence, if $\frac{dS_{Aa,AA}}{d\zeta}(0) \neq 0$, for $|\zeta|$ small enough, the stability of AA is determined by the sign of $\frac{dS_{Aa,AA}}{d\zeta}(0)\,\zeta$ (and similarly for aa).
By a direct computation, one gets

\[
\frac{dS_{Aa,AA}}{d\zeta}(0) = - \frac{dS_{Aa,aa}}{d\zeta}(0) .
\]

Hence the two fixed points have opposite stability, so that invasion implies fixation. The fixed point AA is stable (the mutant does not invade) if $\zeta$ and $dS_{Aa,AA}/d\zeta(0)$ have opposite sign.

We now summarize these results. We denote by the piece of contained in the positive 
quadrant. 

Theorem C.2 For $\zeta$ non zero of small enough modulus, if $\zeta \, dS_{Aa,AA}/d\zeta(0) > 0$ (which implies $dS_{Aa,AA}/d\zeta(0) \neq 0$) the fixed point AA is unstable and we have fixation for the macroscopic dynamics.

More precisely, this curve is the piece of unstable manifold between AA and aa. There exists an invariant tubular neighborhood $\mathcal{V}_\zeta$ of $\Gamma_\zeta$ such that the orbit of any initial condition in $\mathcal{V}_\zeta$ converges to aa.

If $\zeta \, dS_{Aa,AA}/d\zeta(0) < 0$, the fixed point AA is stable and the mutant disappears in the macroscopic dynamics.

Proof The result follows immediately from Theorem 3.3, Theorem C.1 and Theorem B.4. □
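
The sign rule of Theorem C.2 can be written as a tiny decision function. This is a minimal sketch, not part of the original argument: the function names are hypothetical, and only the first-order terms of the two fitnesses around neutrality are used, with the derivative of $S_{Aa,aa}$ taken as the opposite of that of $S_{Aa,AA}$.

```python
def first_order_fitnesses(zeta, dS_dzeta_at_0):
    # First-order expansions around neutrality (zeta = 0):
    #   S_{Aa,AA}(zeta) ~  dS/dzeta(0) * zeta
    #   S_{Aa,aa}(zeta) ~ -dS/dzeta(0) * zeta
    s_Aa_AA = dS_dzeta_at_0 * zeta
    s_Aa_aa = -dS_dzeta_at_0 * zeta
    return s_Aa_AA, s_Aa_aa


def macroscopic_outcome(zeta, dS_dzeta_at_0):
    # Outcome read off from the sign of zeta * dS_{Aa,AA}/dzeta(0).
    s_Aa_AA, _ = first_order_fitnesses(zeta, dS_dzeta_at_0)
    if s_Aa_AA > 0:
        return "fixation of a"
    if s_Aa_AA < 0:
        return "mutant disappears"
    return "undetermined at first order"
```

The two first-order fitnesses always have opposite signs, which is the sense in which invasion of the mutant implies its fixation for small mutational steps.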



The last results of this section concern the proof of Theorem 4.7. Indeed we want to prove the monotonicity of the function

\[
F_W(t) = \frac{\langle M(t), W(\zeta) \rangle}{\langle M(t), \mathbf{1} \rangle} .
\]

Here $M(t)$ denotes a trajectory of the vector field $X(\zeta, \cdot)$, namely

\[
\frac{dM}{dt} = X(\zeta, M) , \qquad \text{(C.1)}
\]

in other words $M(t) = \varphi_t(M_0)$, and $W(\zeta)$ is a three dimensional vector depending continuously on $\zeta$. We denote by $\mathbf{1}$ the vector with all components equal to one.



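Proposition C.3 Assume

\[
\inf_{v \in [-n_0, n_0]} \left| \left\langle \frac{d\Gamma_0}{dv}(v), W(0) \right\rangle \right| > 0 .
\]

Then for any $|\zeta|$ sufficiently small, under the hypothesis of Theorem C.2, if $M_0$ is close enough to the curve $\Gamma_\zeta$, the function $F_W(t)$ is strictly monotone. The same result holds if $W(0)$ is proportional to $\mathbf{1}$ and

\[
\inf_{v \in [-n_0, n_0]} \left| \left\langle \frac{d\Gamma_0}{dv}(v), \frac{dW}{d\zeta}(0) \right\rangle \right| > 0 .
\]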



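Proof We have

\[
\frac{dF_W}{dt} = \frac{1}{\langle M(t), \mathbf{1} \rangle} \left( \langle X(M), W(\zeta) \rangle - \frac{\langle X(M), \mathbf{1} \rangle}{\langle M(t), \mathbf{1} \rangle} \langle M, W(\zeta) \rangle \right) .
\]

Since the invariant curve is transversally attracting, it is enough to consider a point $M \in \Gamma_\zeta$. If $s$ denotes the curvilinear abscissa of the curve $\Gamma_\zeta$, we have for any $s$

\[
X(\zeta, \Gamma_\zeta(s)) = \| X(\zeta, \Gamma_\zeta(s)) \| \, \frac{d\Gamma_\zeta}{ds} .
\]

Therefore on the invariant curve ($M(t) = \Gamma_\zeta(s)$ for a certain $s$ which depends on $t$),

\[
\frac{1}{\langle M(t), \mathbf{1} \rangle} \left( \langle X(M), W(\zeta) \rangle - \frac{\langle X(M), \mathbf{1} \rangle}{\langle M(t), \mathbf{1} \rangle} \langle M, W(\zeta) \rangle \right)
= \frac{\| X(\zeta, \Gamma_\zeta(s)) \|}{\langle \Gamma_\zeta(s), \mathbf{1} \rangle} \left\langle \frac{d\Gamma_\zeta}{ds} - \frac{\langle d\Gamma_\zeta/ds, \mathbf{1} \rangle}{\langle \Gamma_\zeta(s), \mathbf{1} \rangle} \Gamma_\zeta(s) , W(\zeta) \right\rangle .
\]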
By Theorem 4.1 in Hirsch, Pugh and Shub (1977) we have



lim 



dT c _ dT. 



o 



C->o ds ds t/4u 2 (s) + 2n§ 



-2v(s) 
K v(s) + n . 



where 



dv 



ds y / Av 2 (s) +2n 2 Q 
By a direct computation, one can check that 



lim 

C^o \ ds 



C 1 







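and the first part of the result follows from Theorem C.2.

If $W(0) = \gamma \mathbf{1}$ for some real number $\gamma$, we have

\[
W(\zeta) = \gamma \mathbf{1} + \zeta \frac{dW}{d\zeta}(0) + O(\zeta^2) .
\]

Therefore, since the component of $W(\zeta)$ along $\mathbf{1}$ does not contribute,

\[
\frac{1}{\langle M(t), \mathbf{1} \rangle} \left( \langle X(M), W(\zeta) \rangle - \frac{\langle X(M), \mathbf{1} \rangle}{\langle M(t), \mathbf{1} \rangle} \langle M, W(\zeta) \rangle \right)
= \frac{\zeta \, \| X(\zeta, \Gamma_\zeta(s)) \|}{\langle \Gamma_\zeta(s), \mathbf{1} \rangle} \left( \left\langle \frac{d\Gamma_\zeta}{ds} - \frac{\langle d\Gamma_\zeta/ds, \mathbf{1} \rangle}{\langle \Gamma_\zeta(s), \mathbf{1} \rangle} \Gamma_\zeta(s) , \frac{dW}{d\zeta}(0) \right\rangle + O(\zeta) \right) ,
\]

and the result follows as before. □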



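Consider now the average phenotypic trait $\phi$. This corresponds to the vector

\[
W_\phi(\zeta) = \begin{pmatrix} \phi(u_A, u_A) \\ \phi(u_A, u_a) \\ \phi(u_a, u_a) \end{pmatrix}
= \begin{pmatrix} \phi(u_A, u_A) \\ \phi(u_A, u_A + \zeta) \\ \phi(u_A + \zeta, u_A + \zeta) \end{pmatrix}
= \phi(u_A, u_A) \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \zeta \, \frac{d}{du} \phi(u, u) \Big|_{u = u_A} \begin{pmatrix} 0 \\ 1/2 \\ 1 \end{pmatrix} + O(\zeta^2) .
\]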

Corollary C.4 The function $F_{W_\phi}$ is strictly monotone for $|\zeta|$ small enough.
Proof One gets 

'dT dW A ,„A I 1 



dv ' d( 



-(0) 





and by Proposition C.3 we get the monotonicity in time of the average phenotypic trait. 

□ 
D Proof of Theorem 4.5



The proof of the theorem will essentially follow the same steps as the proof of Theorem 1 in Champagnat (2006) and of Appendix A in Champagnat and Méléard (2011). We will not repeat the details and we will restrict ourselves to the steps that must be modified. The proof is based on intermediary results that we state now.



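Proposition D.1 Assume that for $K \geq 1$, $\mathrm{Supp}(\nu_0^{\sigma,K}) = \{AA, Aa, aa\}$ and

\[
\lim_{K \to \infty} \big( \langle \nu_0^{\sigma,K}, \mathbf{1}_{AA} \rangle, \langle \nu_0^{\sigma,K}, \mathbf{1}_{Aa} \rangle, \langle \nu_0^{\sigma,K}, \mathbf{1}_{aa} \rangle \big) = (x_0, y_0, z_0) \in \mathcal{V}_\zeta
\]

a.s., where $\mathcal{V}_\zeta$ is defined in Theorem C.2. Then for all $T > 0$

\[
\lim_{K \to \infty} \sup_{t \in [0,T]} \big| \langle \nu_t^{\sigma,K}, \mathbf{1}_{AA} \rangle - \big( \varphi_t(x_0, y_0, z_0) \big)_{AA} \big| = 0 \quad \text{a.s.,} \qquad \text{(D.1)}
\]

and similarly for Aa and aa, where $\varphi_t$ is the flow of the vector field (3.6).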



The proof of this result can be obtained following a standard compactness-uniqueness argument (see Ethier and Kurtz (1986) or Fournier and Méléard (2004)) and using Theorem C.2.



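Proposition D.2 Let $\mathrm{Supp}(\nu_0^{\sigma,K}) = \{AA\}$ and let $\tau_1$ denote the first mutation time. For any sufficiently small $\varepsilon > 0$, if $\langle \nu_0^{\sigma,K}, \mathbf{1}_{AA} \rangle$ belongs to the $(\varepsilon/2)$-neighborhood of $n_{AA} = (f_{AA} - D_{AA}) / C_{AA,AA}$, the time of exit of $\langle \nu_t^{\sigma,K}, \mathbf{1}_{AA} \rangle$ from the $\varepsilon$-neighborhood of $n_{AA}$ is bigger than $e^{VK} \wedge \tau_1$ with probability converging to 1.

Moreover, there exists a constant $c$ such that for any sufficiently small $\varepsilon > 0$, the previous result still holds if the death rate of an individual with genotype AA,

\[
D_{AA} + C_{AA,AA} \langle \nu_t^{\sigma,K}, \mathbf{1}_{AA} \rangle , \qquad \text{(D.2)}
\]

is perturbed by an additional random process that is uniformly bounded by $c\varepsilon$.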



Such results are standard (cf. Champagnat (2006)). The first part of this proposition is an exponential deviation estimate on the so-called "exit from an attracting domain" (Freidlin and Wentzell (1984)). It is used to prove that when the first mutation occurs, the population density has never left the $\varepsilon$-neighborhood of $n_{AA}$. When a mutation $a$ occurs, the additional term in (D.2) is $C_{AA,Aa} \langle \nu_t^{\sigma,K}, \mathbf{1}_{Aa} \rangle + C_{AA,aa} \langle \nu_t^{\sigma,K}, \mathbf{1}_{aa} \rangle$, which is smaller than $C\varepsilon$ if $\langle \nu_t^{\sigma,K}, \mathbf{1}_{Aa} \rangle + \langle \nu_t^{\sigma,K}, \mathbf{1}_{aa} \rangle \leq \varepsilon$.

From these results, one can deduce the following proposition, already proved in Champagnat (2006).
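Proposition D.3 Let $\mathrm{Supp}(\nu_0^{\sigma,K}) = \{AA\}$ and let $\tau_1$ denote the first mutation time. There exists $\varepsilon_0$ such that if $\langle \nu_0^{\sigma,K}, \mathbf{1} \rangle$ belongs to the $\varepsilon_0$-neighborhood of $n_{AA}$, then for any $\varepsilon < \varepsilon_0$,

\[
\lim_{K \to \infty} \mathbb{P}^K \Big( \tau_1 > \ln K, \ \sup_{t \in [\ln K, \tau_1]} \big| \langle \nu_t^{\sigma,K}, \mathbf{1} \rangle - n_{AA} \big| < \varepsilon \Big) = 1 ,
\]

and $K \mu_K \tau_1$ converges in law (when $K$ tends to infinity) to a random variable with exponential law with parameter $2 f_{AA} \, p_{AA} \, n_{AA}$, that is for any $t > 0$,

\[
\lim_{K \to \infty} \mathbb{P}^K \Big( \tau_1 > \frac{t}{K \mu_K} \Big) = \exp(-2 \, p_{AA} \, f_{AA} \, n_{AA} \, t) .
\]

Then, if $\ln K \ll \frac{1}{K \mu_K}$, we deduce that $\lim_{K \to \infty} \mathbb{P}^K(\tau_1 < \ln K) = 0$ and that for any $\varepsilon > 0$

\[
\lim_{K \to \infty} \mathbb{P}^K \Big( \sup_{t \in [0, \tau_1]} \big| \langle \nu_t^{\sigma,K}, \mathbf{1} \rangle - n_{AA} \big| > \varepsilon \Big) = 0 .
\]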



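Let us define two stopping times which describe the first time where the process arrives in an $\varepsilon$-neighborhood of a stationary state of the dynamical system.

\[
\tau_A = \tau_A(\varepsilon, K) = \inf\{ t > 0, \ \langle \nu_t^{\sigma,K}, \mathbf{1}_{aa} \rangle = \langle \nu_t^{\sigma,K}, \mathbf{1}_{Aa} \rangle = 0 \ ; \ | \langle \nu_t^{\sigma,K}, \mathbf{1}_{AA} \rangle - n_{AA} | < \varepsilon \} , \qquad \text{(D.3)}
\]

\[
\tau_a = \tau_a(\varepsilon, K) = \inf\{ t > 0, \ | \langle \nu_t^{\sigma,K}, \mathbf{1}_{aa} \rangle - n_{aa} | < \varepsilon \ ; \ \langle \nu_t^{\sigma,K}, \mathbf{1}_{Aa} \rangle = \langle \nu_t^{\sigma,K}, \mathbf{1}_{AA} \rangle = 0 \} . \qquad \text{(D.4)}
\]

Note that $\tau_A$ is the time of extinction of the allele $a$ and fixation of the allele $A$, and that $\tau_a$ is the time of extinction of the allele $A$ and fixation of the allele $a$.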



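Proposition D.4 Recall that $S_{Aa,AA}$ has been defined in (3.7). Let $(z_K)$ be a sequence of integers such that $z_K / K$ converges to $n_{AA}$. Then

\[
\lim_{\varepsilon \to 0} \lim_{K \to \infty} \mathbb{P}^K_{\frac{z_K}{K} \delta_{AA} + \frac{1}{K} \delta_{Aa}} ( \tau_a < \tau_A ) = \frac{[S_{Aa,AA}]_+}{f_{Aa}} , \qquad \text{(D.5)}
\]

\[
\lim_{\varepsilon \to 0} \lim_{K \to \infty} \mathbb{P}^K_{\frac{z_K}{K} \delta_{AA} + \frac{1}{K} \delta_{Aa}} ( \tau_A < \tau_a ) = 1 - \frac{[S_{Aa,AA}]_+}{f_{Aa}} , \qquad \text{(D.6)}
\]

\[
\forall \eta > 0, \quad \lim_{\varepsilon \to 0} \lim_{K \to \infty} \mathbb{P}^K_{\frac{z_K}{K} \delta_{AA} + \frac{1}{K} \delta_{Aa}} \Big( \tau_a \wedge \tau_A > \frac{\eta}{K \mu_K} \wedge \tau_1 \Big) = 0 . \qquad \text{(D.7)}
\]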



Proof The proof is inspired by the proof of Lemma 3 in Champagnat (2006). We introduce the following stopping times.

\[
R^K_\varepsilon = \inf\{ t \geq 0 \ ; \ | \langle \nu_t^{\sigma,K}, \mathbf{1}_{AA} \rangle - n_{AA} | \geq \varepsilon \} ,
\]

\[
S^K_\varepsilon = \inf\{ t \geq 0 \ ; \ \langle \nu_t^{\sigma,K}, \mathbf{1}_{Aa} \rangle + \langle \nu_t^{\sigma,K}, \mathbf{1}_{aa} \rangle \geq \varepsilon \} .
\]

$R^K_\varepsilon$ is the time of drift of the resident population AA away from its equilibrium; $S^K_\varepsilon$ is the time of invasion of the mutant allele $a$, reached when either the population with genotype Aa or the one with genotype aa is sufficiently large.



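Assume that $\langle \nu_0^{\sigma,K}, \mathbf{1}_{Aa} \rangle = 1/K$. Using the second part of Proposition D.2, one can prove as in Champagnat (2006) that there exist $\rho, V > 0$ such that, for $K$ large enough,

\[
\mathbb{P} \Big( \frac{\rho}{K \mu_K} < \tau_1 \Big) \geq 1 - \varepsilon \quad \text{and} \quad \mathbb{P} \big( S^K_\varepsilon \wedge \tau_1 \wedge e^{KV} < R^K_\varepsilon \big) \geq 1 - \varepsilon .
\]

Then, on $[0, \tau_1 \wedge S^K_\varepsilon \wedge R^K_\varepsilon]$, one has $n_{AA} - \varepsilon < \langle \nu_t^{\sigma,K}, \mathbf{1}_{AA} \rangle < n_{AA} + \varepsilon$ and $\langle \nu_t^{\sigma,K}, \mathbf{1}_{Aa} \rangle < \varepsilon$, $\langle \nu_t^{\sigma,K}, \mathbf{1}_{aa} \rangle < \varepsilon$.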



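Using (3.3), (3.4) and by bounding the birth and death rates from below or from above, it can be easily checked that, for $K$ large enough, almost surely, the process $(\langle \nu_t^{\sigma,K}, \mathbf{1}_{Aa} \rangle, \langle \nu_t^{\sigma,K}, \mathbf{1}_{aa} \rangle)$ is stochastically lower-bounded and upper-bounded by two normalized bi-type branching processes $\frac{A^{1,\varepsilon}}{K} = \big( \frac{A^{11,\varepsilon}_t}{K}, \frac{A^{12,\varepsilon}_t}{K} \big)_{t \in \mathbb{R}_+}$ and $\frac{A^{2,\varepsilon}}{K} = \big( \frac{A^{21,\varepsilon}_t}{K}, \frac{A^{22,\varepsilon}_t}{K} \big)_{t \in \mathbb{R}_+}$.

The branching processes $A^1$ and $A^2$ have initial condition $(1, 0)$ and birth rates for a state $(y, z)$ of the form (for $i = 1, 2$),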

y 

N Aa(. e ,Vi z ) = fAaV + 2f aa z + oi(e)(y + z) ; iV* a (e, y, z) = ^ + /a» z ) 02(e), 
and death rates

\[
M^i_{Aa}(\varepsilon, y, z) = (D_{Aa} + C_{Aa,AA} \, n_{AA}) \, y + O_3(\varepsilon)(y + z) ,
\]
\[
M^i_{aa}(\varepsilon, y, z) = (D_{aa} + C_{aa,AA} \, n_{AA}) \, z + O_4(\varepsilon)(y + z) .
\]

Moreover we can check that the $O_j(\varepsilon)$ do not depend on $K$.

Let us denote by $q^i_1(t)$ and $q^i_2(t)$ the probabilities of extinction of the process $A^i$ before time $t$, starting respectively from $(1, 0)$ or $(0, 1)$. These probabilities correspond to the extinction of the allele $a$. Using the generating function, it can be proved (see Athreya and Ney (1972)) that the vector $q^i(t)$ is solution of the differential system $\dot{q}^i = Y^i(\varepsilon, q^i)$, where the vector field $Y^i$ is of class $C^2$ and

\[
Y(0, (q_1, q_2)) = \begin{pmatrix}
f_{Aa} \, q_1^2 + (D_{Aa} + C_{Aa,AA} \, n_{AA}) - (f_{Aa} + D_{Aa} + C_{Aa,AA} \, n_{AA}) \, q_1 \\
2 f_{aa} \, q_1 q_2 + (D_{aa} + C_{aa,AA} \, n_{AA}) - (2 f_{aa} + D_{aa} + C_{aa,AA} \, n_{AA}) \, q_2
\end{pmatrix} .
\]

Note that this vector is independent of $i$. □



Lemma D.5 For any $\varepsilon > 0$ small enough, we have the following properties.

i) The vector field $Y^i(\varepsilon, \cdot)$ vanishes at the point $M_0 = (1, 1)$.

ii) If $S_{Aa,AA} < 0$, this fixed point is stable, and the trajectory emanating from the origin converges to this fixed point.

iii) If $S_{Aa,AA} > 0$, this fixed point is unstable. There is another fixed point

\[
P^i_\varepsilon = \begin{pmatrix}
\dfrac{D_{Aa} + C_{Aa,AA} \, n_{AA}}{f_{Aa}} \\[2ex]
\dfrac{f_{Aa} \, (D_{aa} + C_{aa,AA} \, n_{AA})}{f_{Aa} \, (2 f_{aa} + D_{aa} + C_{aa,AA} \, n_{AA}) - 2 f_{aa} \, (D_{Aa} + C_{Aa,AA} \, n_{AA})}
\end{pmatrix} + O(\varepsilon) ,
\]

which is stable, and the trajectory emanating from the origin converges to this fixed point.

Proof Assertion i) follows by a direct computation.

The difference between $Y^i(\varepsilon, \cdot)$ and $Y(0, \cdot)$ is of order $\varepsilon$ in $C^2$. The first parts of assertions ii) and iii) follow at once from the similar results for $Y(0, \cdot)$ and the stability of hyperbolic fixed points (see for example Guckenheimer and Holmes (1983)). Note that in case iii),

\[
2 f_{aa} \, q_1 - (2 f_{aa} + D_{aa} + C_{aa,AA} \, n_{AA}) < 0 ,
\]

since $q_1 \in [0, 1]$.

We now prove the second part of case ii). Let $\Phi^\varepsilon_t$ denote the flow of the vector field $Y(\varepsilon, \cdot)$. Since the fixed point $M_0$ is stable for $Y(0, \cdot)$, there is a number $r_0 > 0$ such that for any $\varepsilon > 0$ small enough, the ball $B_{r_0}(M_0)$ centered in $M_0$ and of radius $r_0$ is attracted to the fixed point $M_0$ by the flow $\Phi^\varepsilon_t$. Let $T_0 > 0$ denote the smallest time such that $\Phi^0_{T_0}((0,0)) \in B_{r_0/2}(M_0)$. This time is finite since $Y(0, (0,0)) \neq 0$, $q_1(t) = \Phi^0_t((0,0))_1$ converges to 1 when $t$ tends to infinity, and $Y_2(0, (q_1, q_2))$ is linear in $q_2$. By continuity in $\varepsilon$ of the map $\varepsilon \mapsto \Phi^\varepsilon_{T_0}((0,0))$ (see Guckenheimer and Holmes (1983)), we conclude that for any $\varepsilon > 0$ small enough, $\Phi^\varepsilon_{T_0}((0,0)) \in B_{r_0}(M_0)$. The second part of assertion ii) follows.

The second part of assertion iii) is proved by similar arguments, noting that the fixed point $P_\varepsilon$ depends continuously on $\varepsilon$. □
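
The dichotomy of Lemma D.5 at $\varepsilon = 0$ can be checked numerically. The following is a minimal sketch under stated assumptions: the first component of $Y(0, \cdot)$ is taken to be quadratic in $q_1$, as in the standard generating-function equation for a two-type branching process; the parameter values used with it are hypothetical; and the names `d_Aa` and `d_aa` are shorthand introduced here for the total death rates $D_{Aa} + C_{Aa,AA} n_{AA}$ and $D_{aa} + C_{aa,AA} n_{AA}$.

```python
def extinction_field(q, f_Aa, f_aa, d_Aa, d_aa):
    """Limiting vector field Y(0, q) for the extinction probabilities.

    d_Aa and d_aa are shorthand (introduced here) for the total death
    rates D_Aa + C_{Aa,AA} n_AA and D_aa + C_{aa,AA} n_AA.
    """
    q1, q2 = q
    dq1 = f_Aa * q1 ** 2 + d_Aa - (f_Aa + d_Aa) * q1
    dq2 = 2.0 * f_aa * q1 * q2 + d_aa - (2.0 * f_aa + d_aa) * q2
    return dq1, dq2


def flow_from_origin(f_Aa, f_aa, d_Aa, d_aa, t_end=40.0, dt=0.01):
    """Classical RK4 integration of dq/dt = Y(0, q) started at (0, 0)."""
    def field(q):
        return extinction_field(q, f_Aa, f_aa, d_Aa, d_aa)

    def shifted(q, k, scale):
        return (q[0] + scale * k[0], q[1] + scale * k[1])

    q = (0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        k1 = field(q)
        k2 = field(shifted(q, k1, 0.5 * dt))
        k3 = field(shifted(q, k2, 0.5 * dt))
        k4 = field(shifted(q, k3, dt))
        q = (q[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             q[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return q


def nontrivial_fixed_point(f_Aa, f_aa, d_Aa, d_aa):
    """Second zero of Y(0, .), meaningful when f_Aa > d_Aa."""
    q1 = d_Aa / f_Aa
    q2 = d_aa / (2.0 * f_aa * (1.0 - q1) + d_aa)
    return q1, q2
```

With `f_Aa > d_Aa` the trajectory from the origin approaches the point returned by `nontrivial_fixed_point`, and with `f_Aa < d_Aa` it approaches $(1, 1)$. If one assumes that $S_{Aa,AA} = f_{Aa} - D_{Aa} - C_{Aa,AA} n_{AA}$ (its definition (3.7) is not reproduced in this appendix), then one minus the first coordinate of the nontrivial fixed point equals $S_{Aa,AA} / f_{Aa}$.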



We conclude the proof of Proposition D.4 by similar arguments as in Champagnat (2006) or in Champagnat and Méléard (2011), using Theorems C.1 and C.2.